Methods for seamless nucleic acid assembly

ABSTRACT

Provided herein are methods, systems, and compositions for seamless nucleic acid assembly. Such methods, systems, and compositions for seamless nucleic acid assembly include those for in vitro recombination cloning, single-stranded hierarchal DNA assembly, or overlap extension PCR without primer removal.

CROSS-REFERENCE

This application is a divisional of U.S. patent application Ser. No. 16/712,678, filed Dec. 12, 2019, which is a continuation of PCT/US2018/37152 filed Jun. 12, 2018, which claims the benefit of U.S. Provisional Patent Application No. 62/518,489 filed on Jun. 12, 2017, which are incorporated herein by reference in their entirety.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Jun. 6, 2018, is named 44854-743_401_SL.txt and is 1,591 bytes in size.

BACKGROUND

De novo nucleic acid synthesis is a powerful tool for basic biological research and biotechnology applications. While various methods are known for the synthesis of relatively short fragments of nucleic acids on a small scale, these techniques suffer from limitations in scalability, automation, speed, accuracy, and cost. Thus, a need remains for efficient methods of seamless nucleic acid assembly.

BRIEF SUMMARY

Provided herein is a method for nucleic acid assembly, comprising: (a) providing at least one double stranded nucleic acid comprising in 5′ to 3′ order: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence, wherein the first homology sequence and the second homology sequence each comprises about 20 to about 100 base pairs in length; (b) providing a vector comprising the first homology sequence and the second homology sequence; and (c) mixing the at least one double stranded nucleic acid and the vector with a bacterial lysate. Further provided herein is a method, wherein the bacterial lysate comprises a nuclease or a recombinase. Further provided herein is a method, wherein the bacterial lysate comprises a nuclease and a recombinase. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 20 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 41 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises 30 to 50 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises 35 to 45 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 100 base pairs. Further provided herein is a method, wherein the first homology sequence or the second homology sequence is flanked by the 5′ flanking adapter sequence and the 3′ flanking adapter sequence. Further provided herein is a method, wherein the first homology sequence or the second homology sequence is at a terminal end. Further provided herein is a method, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.

Provided herein is a method for nucleic acid assembly comprising: (a) de novo synthesizing a plurality of polynucleotides, wherein each polynucleotide comprises a first homology region that comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence, wherein the first homology sequence and the second homology sequence each comprises about 20 to about 100 base pairs in length, and wherein each polynucleotide comprises a homology sequence identical to that of another polynucleotide of the plurality of polynucleotides; and (b) mixing of the plurality of polynucleotides with a bacterial lysate to processively form nucleic acids each having a predetermined sequence. Further provided herein is a method, wherein the bacterial lysate comprises a nuclease. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 20 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 41 base pairs. Further provided herein is a method, wherein the first homology sequence and the second homology sequence each comprises about 100 base pairs. Further provided herein is a method, wherein the first homology sequence or the second homology sequence is flanked by the 5′ flanking adapter sequence and the 3′ flanking adapter sequence. Further provided herein is a method, wherein the first homology sequence or the second homology sequence is at a terminal end. Further provided herein is a method, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.

Provided herein is a method for nucleic acid assembly comprising: (a) providing a first double stranded nucleic acid and a second double stranded nucleic acid, wherein the first double stranded nucleic acid comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, an insert sequence, a homology sequence, and a 3′ flanking adapter sequence, and wherein the second double stranded nucleic acid comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, an insert sequence, a homology sequence, and a 3′ flanking adapter sequence; (b) annealing a uracil separately to each of (i) a 5′ end and a 3′ end of the first double stranded nucleic acid and (ii) a 5′ end and a 3′ end of the second double stranded nucleic acid; (c) amplifying the first double stranded nucleic acid and the second double stranded nucleic acid using a uracil compatible polymerase to form amplification products; (d) mixing the amplification products to form a mixture; and (e) amplifying the mixture using a uracil incompatible polymerase to generate the nucleic acid. Further provided herein is a method, wherein the uracil incompatible polymerase is a DNA polymerase. Further provided herein is a method, wherein a plurality of double stranded nucleic acids is provided. Further provided herein is a method, wherein the homology sequence comprises about 20 to about 100 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 20 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 41 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 100 base pairs. Further provided herein is a method, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.

Provided herein is a method for nucleic acid assembly comprising: (a) providing predetermined sequences for a first double stranded nucleic acid and a second double stranded nucleic acid, wherein the first double stranded nucleic acid comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, an insert sequence, a homology sequence, and a 3′ flanking adapter sequence, and wherein the second double stranded nucleic acid comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, an insert sequence, a homology sequence, and a 3′ flanking adapter sequence; (b) synthesizing a plurality of polynucleotides encoding for the predetermined sequences; (c) annealing a universal primer comprising uracil at a terminal end of the first double stranded nucleic acid and the second double stranded nucleic acid; (d) amplifying the first double stranded nucleic acid and the second double stranded nucleic acid using a uracil incompatible polymerase to form amplification products; (e) mixing the amplification products to form a mixture; and (f) amplifying the mixture to generate the nucleic acid. Further provided herein is a method, wherein the uracil incompatible polymerase is a DNA polymerase. Further provided herein is a method, wherein a plurality of double stranded nucleic acids is provided. Further provided herein is a method, wherein the homology sequence comprises about 20 to about 100 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 20 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 41 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 100 base pairs. Further provided herein is a method, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.

Provided herein is a method for nucleic acid assembly comprising: (a) providing a plurality of double stranded nucleic acids; (b) annealing a uracil at a 5′ end and a 3′ end of at least two of the double stranded nucleic acids; (c) amplifying the double stranded nucleic acids using a uracil compatible polymerase to form amplification products; (d) mixing the amplification products from step (c) to form a mixture; and (e) amplifying the mixture from step (d) using a uracil incompatible polymerase to generate a single-stranded nucleic acid. Further provided herein is a method, wherein the uracil incompatible polymerase is a DNA polymerase. Further provided herein is a method, wherein a plurality of double stranded nucleic acids is provided. Further provided herein is a method, wherein the homology sequence comprises about 20 to about 100 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 20 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 41 base pairs. Further provided herein is a method, wherein the homology sequence comprises about 100 base pairs. Further provided herein is a method, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.

INCORPORATION BY REFERENCE

All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 depicts a schematic of single-stranded DNA mediated hierarchal assembly with two fragments.

FIG. 2 depicts a schematic of single-stranded DNA mediated hierarchal assembly with three fragments.

FIG. 3 depicts a schematic for in vitro recombination cloning with a single gene fragment.

FIG. 4 depicts a schematic for in vitro recombination cloning with two gene fragments.

FIG. 5 depicts gene fragment designs with varying lengths of non-homologous sequences.

FIGS. 6A-6B depict gene fragment designs of internal homology sequences.

FIG. 7 depicts a workflow for in vitro recombination cloning.

FIG. 8 depicts a schematic of overlap extension polymerase chain reaction without primer removal.

FIG. 9 depicts systems for polynucleotide synthesis and seamless nucleic acid assembly.

FIG. 10 illustrates a computer system.

FIG. 11 is a block diagram illustrating architecture of a computer system.

FIG. 12 is a block diagram of a multiprocessor computer system using a shared virtual address memory space.

FIG. 13 is a diagram demonstrating a network configured to incorporate a plurality of computer systems, a plurality of cell phones and personal data assistants, and Network Attached Storage (NAS).

FIG. 14 is a plot of correct assembly (black bars) and incorrect assembly (white bars) following overlap extension polymerase chain reaction without primer removal for two genes (Gene 1 and Gene 2). Homology sequence length includes 20, 41, and 100 base pairs.

FIG. 15 is a plot of correct assembly (black bars) and incorrect assembly (white bars) following single-stranded DNA mediated hierarchal assembly using Q5 DNA polymerase and KapaHiFi polymerase enzymes. Homology sequence length includes 20, 41, and 100 base pairs.

FIG. 16 is a plot of colony forming units (CFU, Y-axis) versus insert:vector ratio (X-axis). An amount of insert includes 13 fmol (white bars), 26 fmol (hashed bars), and 40 fmol (black bars).

FIG. 17 is an image capture of a capillary gel electrophoresis following in vitro recombination cloning.

FIG. 18 is a plot of colony forming units (CFU) of homology sequences comprising 20, 41, or 100 base pairs. Homology sequences are flanked by universal primers (internal) or at a 5′ or 3′ end of an insert (terminal).

FIG. 19 is a plot of correct assembly (black bars) and incorrect assembly (white bars) following in vitro recombination cloning. Homology sequences comprise 20, 41, or 100 base pairs and are flanked by universal primers (internal) or at a 5′ or 3′ end of an insert (terminal).

FIG. 20A is a plot of percentage of hierarchal assembly (HA) for non-homologous sequences comprising 0, 24, 124, or 324 base pair lengths.

FIG. 20B is a plot of colony forming units (CFU) for non-homologous sequences comprising 0, 24, 124, or 324 base pair lengths.

FIG. 21A is a plot of percentage of hierarchal assembly (HA) for internal sequences comprising 24, 124, or 324 base pair lengths.

FIG. 21B is a plot of colony forming units (CFU) for internal sequences comprising 24, 124, or 324 base pair lengths.

DETAILED DESCRIPTION

Definitions

Throughout this disclosure, various embodiments are presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of any embodiments. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range to the tenth of the unit of the lower limit unless the context clearly dictates otherwise. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual values within that range, for example, 1.1, 2, 2.3, 5, and 5.9. This applies regardless of the breadth of the range. The upper and lower limits of these intervening ranges may independently be included in the smaller ranges, and are also encompassed within the invention, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the invention, unless the context clearly dictates otherwise.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of any embodiment. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

Unless specifically stated or obvious from context, as used herein, the term “nucleic acid” encompasses double- or triple-stranded nucleic acids, as well as single-stranded molecules. In double- or triple-stranded nucleic acids, the nucleic acid strands need not be coextensive (i.e., a double-stranded nucleic acid need not be double-stranded along the entire length of both strands). Nucleic acid sequences, when provided, are listed in the 5′ to 3′ direction, unless stated otherwise. Methods described herein provide for the generation of isolated nucleic acids. Methods described herein additionally provide for the generation of isolated and purified nucleic acids. A “nucleic acid” as referred to herein can comprise at least 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, or more bases in length. Moreover, provided herein are methods for the synthesis of any number of polypeptide-segments encoding nucleotide sequences, including sequences encoding non-ribosomal peptides (NRPs), sequences encoding non-ribosomal peptide-synthetase (NRPS) modules and synthetic variants, polypeptide segments of other modular proteins, such as antibodies, polypeptide segments from other protein families, including non-coding DNA or RNA, such as regulatory sequences e.g., promoters, transcription factors, enhancers, siRNA, shRNA, RNAi, miRNA, small nucleolar RNA derived from microRNA, or any functional or structural DNA or RNA unit of interest. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, intergenic DNA, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), small nucleolar RNA, ribozymes, complementary DNA (cDNA), which is a DNA representation of mRNA, usually obtained by reverse transcription of messenger RNA (mRNA) or by amplification; DNA molecules produced synthetically or by amplification, genomic DNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. cDNA encoding for a gene or gene fragment referred to herein may comprise at least one region encoding for exon sequences without an intervening intron sequence in the genomic equivalent sequence.

Unless specifically stated or obvious from context, as used herein, the term “about” in reference to a number or range of numbers is understood to mean the stated number and numbers +/−10% thereof, or 10% below the lower listed limit and 10% above the higher listed limit for the values listed for a range.
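
The definition of “about” above can be read as a simple numeric test. The following Python sketch is illustrative only; the function names `is_about` and `in_about_range` are hypothetical and are not part of this disclosure. Integer arithmetic is used so that boundary values are not affected by floating-point rounding.

```python
def is_about(value, stated):
    """Return True if value lies within +/-10% of the stated number.

    Written as 10*|value - stated| <= |stated| so that integer inputs
    are compared exactly.
    """
    return 10 * abs(value - stated) <= abs(stated)


def in_about_range(value, low, high):
    """Return True if value lies in 'about low to about high'.

    Per the definition above: 10% below the lower listed limit and
    10% above the higher listed limit.
    """
    return 10 * value >= 9 * low and 10 * value <= 11 * high


# Example: a homology sequence of "about 20 to about 100 base pairs"
# admits lengths from 18 to 110 base pairs under this reading.
print(in_about_range(18, 20, 100))   # lower boundary
print(in_about_range(111, 20, 100))  # just above the upper boundary
```

Under this reading, “about 41 base pairs” covers integer lengths of 37 through 45 base pairs.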

Seamless Assembly of Nucleic Acids

Provided herein are methods for assembly of nucleic acids with increased efficiency and accuracy. Further provided herein are methods of assembly of nucleic acids into long genes. De novo synthesized polynucleotides as described herein are assembled into nucleic acids by in vitro recombination cloning, single-stranded DNA mediated hierarchal assembly, or overlap extension. Generally, methods for nucleic acid assembly as described herein do not require primer removal.

A first exemplary process for seamless assembly of nucleic acids is depicted in FIG. 1. Single-stranded DNA mediated hierarchal assembly is performed with a first gene fragment 102 and second gene fragment 104 comprising a homology sequence 105. In this workflow, the first gene fragment 102 and the second gene fragment 104 are double-stranded and comprise a 5′ flanking adapter sequence 107 a and a 3′ flanking adapter sequence 107 b comprising uracil 103. The first gene fragment 102 and the second gene fragment 104 are amplified with primers and a uracil compatible polymerase. In some instances, the uracil compatible polymerase is Phusion U or Kapa Uracil. The resultant PCR product comprises a uracil at the end of the 3′ flanking adapter sequence 107 b. The first gene fragment 102 and the second gene fragment 104 are diluted, mixed, and amplified 109 with a primer and a uracil incompatible polymerase that stalls at a uracil. In some instances, the uracil incompatible polymerase is Q5 DNA polymerase. The resultant fragments 106, 108, which do not comprise uracil, serve as primers for each other and are combined 113 and amplified 115 to generate a single-stranded DNA molecule. Single-stranded DNA mediated hierarchal assembly can be performed with multiple gene fragments as seen in FIG. 2. Single-stranded DNA mediated hierarchal assembly is performed with a first gene fragment 202, a second gene fragment 204, and a third gene fragment 206 comprising a homology sequence 205. The first gene fragment 202, the second gene fragment 204, and the third gene fragment 206 are double-stranded and comprise a 5′ flanking adapter sequence 207 a and a 3′ flanking adapter sequence 207 b. The 3′ flanking adapter sequence 207 b of the first gene fragment 202 comprises uracil. The 5′ flanking adapter sequence 207 a and the 3′ flanking adapter sequence 207 b of the second gene fragment 204 comprise a uracil. The 3′ flanking adapter sequence 207 b of the third gene fragment 206 comprises uracil. The first gene fragment 202, the second gene fragment 204, and the third gene fragment 206 are amplified with universal primers (primers that are complementary to a region of each of the gene fragments) and a uracil compatible polymerase. In some instances, the uracil compatible polymerase is Phusion U or Kapa Uracil. The resultant PCR product comprises a uracil at the end of at least one of the 5′ flanking adapter sequence 207 a and the 3′ flanking adapter sequence 207 b. The first gene fragment 202, the second gene fragment 204, and the third gene fragment 206 are diluted, mixed, and amplified 209 with universal primers and a uracil incompatible polymerase that stalls at a uracil or is inefficient when interacting with a uracil. In some instances, the uracil incompatible polymerase is Q5 DNA polymerase. The resultant fragment without uracil 208 and fragment comprising uracil 210 serve as primers for each other. The resultant fragment without uracil 208 and fragment comprising uracil 210 are then combined 213 and diluted, mixed, and then amplified 215 with universal primers and DNA polymerase that stalls at uracil (such as Q5 DNA polymerase) to generate an intermediate fragment 212. Intermediate fragment 212 and an additional fragment 214 serve as primers for each other and are combined and amplified 219 to generate a single-stranded DNA molecule.
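
The hierarchal order of joining described for FIGS. 1 and 2 can be illustrated at the sequence level with a short Python sketch. This is a simplified model under stated assumptions, not the disclosed laboratory procedure: sequences are treated as single-strand text, the uracil-bearing adapters and polymerase chemistry are not modeled, a homology sequence is taken to be an exact suffix/prefix match, and the function names `join_by_homology` and `hierarchical_assemble` as well as the toy sequences are hypothetical.

```python
def join_by_homology(left, right, min_overlap=20):
    """Join two sequences that share a homology sequence.

    Searches for the longest suffix of `left` that equals a prefix of
    `right` (at least `min_overlap` bases) and merges the two so the
    shared region appears once.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError("no shared homology sequence of sufficient length")


def hierarchical_assemble(fragments, min_overlap=20):
    """Assemble ordered fragments in pairwise rounds.

    Neighbouring fragments are joined first; the resulting
    intermediates are joined in the next round, and so on. An unpaired
    fragment is carried forward to the following round.
    """
    pool = list(fragments)
    if not pool:
        raise ValueError("at least one fragment is required")
    while len(pool) > 1:
        next_round = []
        for i in range(0, len(pool) - 1, 2):
            next_round.append(join_by_homology(pool[i], pool[i + 1], min_overlap))
        if len(pool) % 2:
            next_round.append(pool[-1])
        pool = next_round
    return pool[0]


# Toy example with 4-base overlaps for readability (hypothetical
# sequences; homology sequences herein are about 20 to about 100 bp).
fragments = ["ATGGCGTACC", "TACCTGAAGT", "AAGTTCGGAC"]
assembled = hierarchical_assemble(fragments, min_overlap=4)
print(assembled)  # ATGGCGTACCTGAAGTTCGGAC
```

With three fragments, the first round joins the first two fragments into an intermediate and carries the third forward, and the second round joins the intermediate with the remaining fragment, which loosely mirrors the intermediate fragment 212 and additional fragment 214 of FIG. 2.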

A second exemplary process for seamless assembly of nucleic acids is depicted in FIG. 3. In vitro recombination cloning is performed with a first gene fragment 302 comprising from 5′ to 3′: a 5′ flanking adapter sequence 307 a, a first homology sequence 303, an insert sequence 305, a second homology sequence 309, and a 3′ flanking adapter sequence 307 b. The first homology sequence 303 is homologous to sequence 311 of vector 304. The second homology sequence 309 is homologous to sequence 313 of vector 304. The first gene fragment 302 and vector 304 are incubated 317 with bacterial cell lysate to generate assembled construct 306.

In vitro recombination cloning can be performed with multiple gene fragments as seen in FIG. 4. In vitro recombination cloning is performed using two gene fragments. A first gene fragment 402 comprises from 5′ to 3′: a 5′ flanking adapter sequence 407 a, a first homology sequence 403, an insert sequence 405, a second homology sequence 409, and a 3′ flanking adapter sequence 407 b. A second gene fragment 404 comprises from 5′ to 3′: a 5′ flanking adapter sequence 407 a, a first homology sequence 411, an insert sequence 413, a second homology sequence 415, and a 3′ flanking adapter sequence 407 b. The first homology sequence 403 of the first gene fragment 402 is homologous to sequence 417 on vector 406. The second homology sequence 409 of the first gene fragment 402 is homologous to the first homology sequence 411 of the second gene fragment 404. The second homology sequence 415 of the second gene fragment 404 is homologous to the sequence 419 of vector 406. The first gene fragment 402, the second gene fragment 404, and vector 406 are incubated 419 with bacterial cell lysate to generate assembled construct 408.
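
The fragment layout and homology relationships described for FIG. 4 can be captured in a small design check. The Python sketch below is illustrative only and rests on stated assumptions: “homologous” is simplified to exact sequence identity, the vector is treated as a linear single-strand string, and the class `GeneFragment`, the function `check_two_fragment_design`, and all sequences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneFragment:
    """Gene fragment in 5' to 3' order, as laid out for FIGS. 3 and 4."""
    adapter_5p: str
    homology_1: str
    insert: str
    homology_2: str
    adapter_3p: str

    def sequence(self):
        """Full fragment sequence, concatenated 5' to 3'."""
        return (self.adapter_5p + self.homology_1 + self.insert
                + self.homology_2 + self.adapter_3p)


def check_two_fragment_design(frag1, frag2, vector):
    """Return a list of design problems (empty if none are found).

    Assumption: homology is modeled as exact identity.
    """
    problems = []
    if frag1.homology_1 not in vector:
        problems.append("first homology sequence of fragment 1 not found in vector")
    if frag1.homology_2 != frag2.homology_1:
        problems.append("fragment 1 and fragment 2 do not share a homology sequence")
    if frag2.homology_2 not in vector:
        problems.append("second homology sequence of fragment 2 not found in vector")
    return problems


# Hypothetical toy sequences (8-base homology for readability).
vector = "GGGG" + "ACGTACGT" + "TTTT" + "CATGCATG" + "CCCC"
frag1 = GeneFragment("AA", "ACGTACGT", "GATTACA", "TGCATGCA", "TT")
frag2 = GeneFragment("AA", "TGCATGCA", "CCGGA", "CATGCATG", "TT")
print(check_two_fragment_design(frag1, frag2, vector))  # []
```

A single-fragment design as in FIG. 3 corresponds to the special case in which both homology sequences of one fragment are found in the vector.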

A third exemplary process for seamless assembly of nucleic acids is depicted in FIG. 8. Overlap extension PCR is performed using a nucleic acid 802 comprising a universal primer binding site 803 and a region complementary to nucleic acid 804. Nucleic acid 804 comprises a universal primer binding site 803. An enzyme 805 cleaves a terminal end of nucleic acid 802 and nucleic acid 804. Nucleic acid 802 and nucleic acid 804 are then amplified and serve as a template for each other.

Primers referred to in the exemplary workflows mentioned herein as “universal primers” are short polynucleotides that recognize a primer binding site common to multiple DNA fragments. However, these workflows are not limited to only use of universal primers, and fragment-specific primers may be incorporated in addition or alternatively. In addition, while exemplary workflows described herein refer to assembly of gene fragments, they are not limited as such and are applicable to the assembly of longer nucleic acids in general.

In Vitro Recombination Cloning

Provided herein are methods for seamless assembly of nucleic acids, comprising in vitro recombination cloning. In some instances, in vitro recombination cloning comprises a gene or fragment thereof for insertion into a vector using a bacterial lysate. In some instances, the gene fragment comprises at least one universal primer. In some instances, the gene fragment comprises a vector homology sequence.

Provided herein are methods for in vitro recombination cloning, wherein a bacterial lysate is used. The bacterial lysate may be derived from Escherichia coli. In some instances, the bacterial lysate is derived from RecA⁻ bacteria. In some instances, the bacterial strain is from JM109 cells. In some instances, the bacterial lysate comprises a nuclease or a recombinase. In some instances, the bacterial lysate comprises a nuclease and a recombinase.

Provided herein are methods for in vitro recombination cloning, wherein a gene fragment is de novo synthesized and comprises a flanking adapter sequence and a homology sequence. The homology sequence may be at a 5′ or 3′ end of the gene fragment. In some instances, the homology sequence is flanked by a pair of flanking adapter sequences. In some instances, the gene fragment comprises at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 homology sequences.

Homology sequences described herein for in vitro recombination cloning may vary in length. Exemplary lengths for homology sequences include, but are not limited to, at least or about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, or more than 200 base pairs. In some instances, the length of the homology sequence is 20 base pairs. In some instances, the length of the homology sequence is 41 base pairs. In some instances, the length of the homology sequence is 100 base pairs. In some instances, the length of the homology sequence has a range of about 10 to 20, 10 to 30, 10 to 40, 10 to 50, 10 to 60, 10 to 70, 10 to 80, 10 to 100, 10 to 125, 10 to 150, 10 to 200, 20 to 30, 20 to 40, 20 to 50, 20 to 60, 20 to 70, 20 to 80, 20 to 100, 20 to 125, 20 to 150, 20 to 200, 30 to 40, 30 to 50, 30 to 60, 30 to 70, 30 to 80, 30 to 100, 30 to 125, 30 to 150, 30 to 200, 40 to 50, 40 to 60, 40 to 70, 40 to 80, 40 to 100, 40 to 125, 40 to 150, 40 to 200, 50 to 60, 50 to 70, 50 to 80, 50 to 100, 50 to 125, 50 to 150, 50 to 200, 60 to 70, 60 to 80, 60 to 100, 60 to 125, 60 to 150, 60 to 200, 70 to 80, 70 to 100, 70 to 125, 70 to 150, 70 to 200, 80 to 100, 80 to 125, 80 to 150, 80 to 200, 100 to 125, 100 to 150, 100 to 200, 125 to 150, 125 to 200, or 150 to 200 base pairs.

Provided herein are methods for in vitro recombination cloning, wherein a number of gene fragments are inserted into a vector. In some instances, the number of gene fragments that are inserted is at least or about 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 gene fragments. In some instances, the number of gene fragments that are inserted has a range of about 1 to 2, 1 to 3, 1 to 4, 1 to 5, 1 to 6, 1 to 7, 1 to 8, 1 to 9, 1 to 10, 2 to 3, 2 to 4, 2 to 5, 2 to 6, 2 to 7, 2 to 8, 2 to 9, 2 to 10, 3 to 4, 3 to 5, 3 to 6, 3 to 7, 3 to 8, 3 to 9, 3 to 10, 4 to 5, 4 to 6, 4 to 7, 4 to 8, 4 to 9, 4 to 10, 5 to 6, 5 to 7, 5 to 8, 5 to 9, 5 to 10, 6 to 7, 6 to 8, 6 to 9, 6 to 10, 7 to 8, 7 to 9, 7 to 10, 8 to 9, 8 to 10, or 9 to 10.

Provided herein are methods for in vitro recombination cloning, wherein a gene fragment comprises a non-homologous sequence. In some instances, the non-homologous sequence comprises at least 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280, 300 or more than 300 base pairs in length. In some instances, the number of base pairs is 24 base pairs. In some instances, the number of base pairs is 124 base pairs. In some instances, the number of base pairs is 324 base pairs. In some instances, the gene fragment does not comprise a non-homologous sequence.

Provided herein are methods for in vitro recombination cloning, wherein the amount of gene fragment or the amount of vector varies. In some instances, the amount of gene fragment is at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more than 100 femtomoles. In some instances, the amount of vector is at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, or more than 100 femtomoles. In some instances, a ratio of gene fragment to vector varies. In some instances, the molar ratio of gene fragment to vector is at least or about 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, or more.
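
As a worked illustration of the femtomole amounts and molar ratios above, the following Python sketch shows the underlying arithmetic. It is not part of the disclosed methods; the function names are hypothetical, and the mass-to-mole conversion assumes an average of roughly 650 g/mol per base pair of double-stranded DNA, a common approximation rather than a value stated herein.

```python
# Assumption: approximate average molar mass of one base pair of
# double-stranded DNA, in g/mol.
AVG_G_PER_MOL_PER_BP = 650.0


def ng_to_fmol(mass_ng, length_bp):
    """Convert a dsDNA mass in nanograms to femtomoles.

    fmol = ng * 1e6 / (length_bp * g/mol per bp)
    """
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    return mass_ng * 1e6 / (length_bp * AVG_G_PER_MOL_PER_BP)


def molar_ratio(insert_fmol, vector_fmol):
    """Return the insert:vector molar ratio as a single number (e.g. 2.0 for 2:1)."""
    if vector_fmol <= 0:
        raise ValueError("vector_fmol must be positive")
    return insert_fmol / vector_fmol


def insert_fmol_for_ratio(vector_fmol, ratio):
    """Return the femtomoles of insert needed for a target insert:vector ratio."""
    return vector_fmol * ratio


# Hypothetical example: 13 fmol of vector at a 2:1 insert:vector ratio
# calls for 26 fmol of insert.
print(insert_fmol_for_ratio(13, 2))
print(round(ng_to_fmol(100, 1000), 1))  # 100 ng of a 1000 bp fragment
```

Under the stated approximation, 100 ng of a 1000 base pair fragment corresponds to roughly 154 femtomoles.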

Provided herein are methods for in vitro recombination cloning, wherein a reaction for in vitro recombination cloning occurs at an optimal temperature. In some instances, the reaction occurs at a temperature optimal for enzymatic activity, for example, a temperature in a range of about 25-80° C., 25-70° C., 25-60° C., 25-50° C., or 25-40° C. In some instances, the temperature is at least or about 15° C., 20° C., 25° C., 30° C., 35° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., or more than 80° C. In some instances, the temperature is about 65° C. In some instances, the enzymatic activity is a nuclease activity. In some instances, the enzymatic activity is a recombinase activity.

Methods described herein for in vitro recombination cloning result in a high percentage of correct assembly. In some instances, the percentage of correct assembly is at least or about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 99%, or more than 99%. In some instances, the percentage of correct assembly is 100%. In some instances, the percentage of incorrect assembly is at most 5%, 10%, 15%, 20%, 25%, or 30%, or more than 30%.

Methods described herein comprising in vitro recombination cloning result in increased efficiency. In some instances, efficiency is measured by number of colony forming units. In some instances, methods described herein result in at least or about 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 12000, 14000, 16000, 18000, 20000, 25000, 30000, 35000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, or more than 900000 colony forming units.

Single-stranded DNA Mediated Hierarchal Assembly

Provided herein are methods for seamless assembly of nucleic acids, wherein methods comprise single-stranded DNA mediated hierarchal assembly. In some instances, the single-stranded DNA mediated hierarchal assembly comprises assembly of a nucleic acid de novo synthesized by methods described herein. In some instances, the assembly comprises amplification of the nucleic acid with a primer, wherein the primer is not removed after amplification. In some instances, assembly results in increased percentage of correctly assembled nucleic acids and improved efficiency.

Provided herein are methods for single-stranded DNA mediated hierarchal assembly, wherein methods comprise an amplification reaction. In some instances, the amplification reaction comprises a polymerase. In some instances, the polymerase is a high fidelity polymerase. In some instances, the polymerase is a DNA polymerase. The DNA polymerase may be from any family of DNA polymerases including, but not limited to, Family A polymerase, Family B polymerase, Family C polymerase, Family D polymerase, Family X polymerase, and Family Y polymerase. In some instances, the DNA polymerase may be a Family B polymerase. Exemplary Family B polymerases are from species including, but not limited to, Pyrococcus furiosus, Thermococcus gorgonarius, Desulfurococcus strain Tok, Thermococcus sp. 9°N-7, Pyrococcus kodakaraensis, Thermococcus litoralis, Methanococcus voltae, Pyrobaculum islandicum, Archaeoglobus fulgidus, Cenarchaeum symbiosum, Sulfolobus acidocaldarius, Sulfurisphaera ohwakuensis, Sulfolobus solfataricus, Pyrodictium occultum, and Aeropyrum pernix. In some instances, the Family B polymerase is a polymerase or derivative thereof (e.g., mutants, chimeras) from Pyrococcus furiosus.

Polymerases described herein for use in an amplification reaction may comprise various enzymatic activities. Polymerases are used in the methods of the invention, for example, to extend primers to produce extension products. In some instances, the DNA polymerase has 5′ to 3′ polymerase activity. In some instances, the DNA polymerase comprises 3′ to 5′ exonuclease activity. In some instances, the DNA polymerase comprises proofreading activity. Exemplary polymerases include, but are not limited to, DNA polymerase (I, II, or III), T4 DNA polymerase, T7 DNA polymerase, Bst DNA polymerase, Bca polymerase, Vent DNA polymerase, Pfu DNA polymerase, and Taq DNA polymerase.

Polymerases described herein for use in an amplification reaction may recognize a modified base. In some instances, the modified base is a variation in nucleic acid composition or a chemical modification. In some instances, a modified base comprises a base other than adenine, guanine, cytosine or thymine in DNA or a base other than adenine, guanine, cytosine or uracil in RNA. Modified bases described herein include, without limitation, oxidized bases, alkylated bases, deaminated bases, pyrimidine derivatives, purine derivatives, ring-fragmented bases, and methylated bases. Exemplary modified bases include, but are not limited to, uracil, 3-meA (3-methyladenine), hypoxanthine, 8-oxoG (7,8-dihydro-8-oxoguanine), FapyG, FapyA, Tg (thymine glycol), hoU (hydroxyuracil), hmU (hydroxymethyluracil), fU (formyluracil), hoC (hydroxycytosine), fC (formylcytosine), 5-meC (5-methylcytosine), 6-meG (O6-methylguanine), 7-meG (N7-methylguanine), εC (ethenocytosine), 5-caC (5-carboxylcytosine), 2-hA, εA (ethenoadenine), 5-fU (5-fluorouracil), 3-meG (3-methylguanine), and isodialuric acid. In some instances, a modified base in DNA is a uracil. Non-limiting examples of uracil compatible DNA polymerases include Pfu polymerase, Pfu Turbo Cx and KAPA HiFi Uracil+. In some instances, the polymerase selected for the amplification reaction is not capable of recognizing a modified base. For example, the polymerase is incompatible with uracil. Exemplary polymerases that are incompatible with uracil include, but are not limited to, KAPA HiFi polymerase, Phusion®, and Q5® High Fidelity DNA polymerase.

In some instances, a single DNA polymerase or a plurality of DNA polymerases are used. In some instances, the same DNA polymerase or set of DNA polymerases are used at different stages of the present methods. For example, in a first amplification reaction a DNA polymerase that is compatible with uracil is used, and in a second amplification reaction a DNA polymerase that is incompatible with uracil is used. In some instances, the DNA polymerases are varied. For example, the DNA polymerases are varied based on enzymatic activities. In some instances, additional polymerases are added during various steps.

Described herein are methods for nucleic acid assembly comprising an amplification reaction, wherein the amplification reaction comprises a universal primer binding sequence. In some instances, the universal primer binding sequence is capable of binding the same 5′ or 3′ primer. In some instances, the universal primer binding sequence is shared among a plurality of target nucleic acids in the amplification reaction.

Provided herein are methods for single-stranded DNA mediated hierarchal assembly, wherein a reaction for single-stranded DNA mediated hierarchal assembly occurs at an optimal temperature. In some instances, the reaction occurs at a temperature optimal for polymerase activity. In some instances, the reaction occurs at a temperature optimal for enzymatic activity. In some instances, the reaction occurs at a temperature in a range of about 25-80° C., 25-70° C., 25-60° C., 25-50° C., or 25-40° C. In some instances, the temperature is at least or about 15° C., 20° C., 25° C., 30° C., 35° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., or more than 80° C.

Provided herein are methods for single-stranded DNA mediated hierarchal assembly, wherein a gene fragment to be assembled comprises a homology sequence. In some instances, the homology sequence is complementary to a homology sequence in another gene fragment to be assembled. The homology sequence may comprise a number of base pairs. In some instances, the number of base pairs is at least or about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200, or more than 200 base pairs. In some instances, the number of base pairs is 20 base pairs. In some instances, the number of base pairs is 41 base pairs. In some instances, the number of base pairs is 100 base pairs. In some instances, the number of base pairs is 10 to 20, 10 to 30, 10 to 40, 10 to 50, 10 to 60, 10 to 70, 10 to 80, 10 to 100, 10 to 125, 10 to 150, 10 to 200, 20 to 30, 20 to 40, 20 to 50, 20 to 60, 20 to 70, 20 to 80, 20 to 100, 20 to 125, 20 to 150, 20 to 200, 30 to 40, 30 to 50, 30 to 60, 30 to 70, 30 to 80, 30 to 100, 30 to 125, 30 to 150, 30 to 200, 40 to 50, 40 to 60, 40 to 70, 40 to 80, 40 to 100, 40 to 125, 40 to 150, 40 to 200, 50 to 60, 50 to 70, 50 to 80, 50 to 100, 50 to 125, 50 to 150, 50 to 200, 60 to 70, 60 to 80, 60 to 100, 60 to 125, 60 to 150, 60 to 200, 70 to 80, 70 to 100, 70 to 125, 70 to 150, 70 to 200, 80 to 100, 80 to 125, 80 to 150, 80 to 200, 100 to 125, 100 to 150, 100 to 200, 125 to 150, 125 to 200, or 150 to 200 base pairs.
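As a non-limiting illustration, the homology relationship between two adjacent gene fragments can be checked computationally. The following Python sketch is an assumption made for illustration only and is not part of the disclosed methods; the function names, the example sequences, and the 10 base minimum are hypothetical. It reports the length of the sequence shared between the 3′ end of one fragment and the 5′ end of the next, with both fragments written 5′ to 3′ on the same strand, so that the shared region of one fragment is complementary to the opposite strand of the other.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def shared_homology_length(upstream: str, downstream: str, min_len: int = 10) -> int:
    """Length of the longest suffix of `upstream` that equals a prefix of `downstream`.

    Both fragments are given 5' to 3' on the same strand. Returns 0 when no
    shared end of at least `min_len` bases exists (min_len is a hypothetical cutoff).
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    longest = min(len(upstream), len(downstream))
    for length in range(longest, min_len - 1, -1):
        if upstream[-length:] == downstream[:length]:
            return length
    return 0


# Hypothetical 20-base homology sequence shared by two fragments.
homology = "ACGTTGCAAGCTTGGATCCA"
fragment_a = "TTTTTTTTTT" + homology
fragment_b = homology + "GGGGGGGGGG"
overlap = shared_homology_length(fragment_a, fragment_b)
# The bottom strand of fragment_b begins (at its 3' end) with the complement
# of the homology sequence carried at the 3' end of fragment_a.
bottom_strand_b = reverse_complement(fragment_b)
```

A design tool built along these lines could flag fragment pairs whose shared ends fall outside a chosen homology length range before synthesis.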

Provided herein are methods for single-stranded DNA mediated hierarchal assembly, wherein a plurality of gene fragments are assembled. In some instances, the number of gene fragments that are assembled is at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more than 10 gene fragments. In some instances, the number of gene fragments is 1 to 2, 1 to 3, 1 to 4, 1 to 5, 1 to 6, 1 to 7, 1 to 8, 1 to 9, 1 to 10, 2 to 3, 2 to 4, 2 to 5, 2 to 6, 2 to 7, 2 to 8, 2 to 9, 2 to 10, 3 to 4, 3 to 5, 3 to 6, 3 to 7, 3 to 8, 3 to 9, 3 to 10, 4 to 5, 4 to 6, 4 to 7, 4 to 8, 4 to 9, 4 to 10, 5 to 6, 5 to 7, 5 to 8, 5 to 9, 5 to 10, 6 to 7, 6 to 8, 6 to 9, 6 to 10, 7 to 8, 7 to 9, 7 to 10, 8 to 9, 8 to 10, or 9 to 10.

Methods described herein for single-stranded DNA mediated hierarchal assembly result in a high percentage of correct assembly. In some instances, the percentage of correct assembly is at least or about 70%, 75%, 80%, 85%, 90%, 95%, 97%, 99%, or more than 99%. In some instances, the percentage of correct assembly is 100%. In some instances, the percentage of incorrect assembly is at most 5%, 10%, 15%, 20%, 25%, 30%, or more than 30%.

Systems for Synthesis of Nucleic Acids and Seamless Assembly

Polynucleotide Synthesis

Provided herein are methods for seamless assembly of nucleic acids following generation of polynucleotides by de novo synthesis by methods described herein. An exemplary workflow is seen in FIG. 9. A computer readable input file comprising a nucleic acid sequence is received. A computer processes the nucleic acid sequence to generate instructions for synthesis of the polynucleotide sequence or a plurality of polynucleotide sequences collectively encoding the nucleic acid sequence. Instructions are transmitted to a material deposition device 903 for synthesis of the plurality of polynucleotides based on the plurality of nucleic acid sequences. The material deposition device 903, such as a polynucleotide synthesizer, is designed to release reagents in a step wise fashion such that multiple polynucleotides extend, in parallel, one residue at a time to generate oligomers with a predetermined nucleic acid sequence. The material deposition device 903 generates oligomers on an array 905 that includes multiple clusters 907 of loci for polynucleotide synthesis and extension. However, the array need not have loci organized in clusters. For example, the loci can be uniformly spread across the array. De novo polynucleotides are synthesized and removed from the plate and an assembly reaction is commenced in a collection chamber 909, followed by formation of a population of longer polynucleotides 911. The collection chamber may comprise a sandwich of multiple surfaces (e.g., a top and bottom surface) or a well or channel containing transferred material from the synthesis surface. De novo polynucleotides can also be synthesized and removed from the plate to form a population of longer polynucleotides 911. The population of longer polynucleotides 911 can then be partitioned into droplets or subject to PCR. The population of longer polynucleotides 911 is then subject to nucleic acid assembly by either in vitro recombination cloning 915 or single-stranded DNA hierarchal assembly 917.
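By way of a hedged example, the step in which a computer processes a nucleic acid sequence into a plurality of polynucleotide sequences collectively encoding that sequence can be sketched as follows. The fixed fragment length, the fixed overlap, and the function name are assumptions chosen for illustration; the workflow of FIG. 9 is not limited to this partitioning scheme.

```python
def split_into_overlapping_fragments(sequence: str, fragment_len: int = 100,
                                     overlap: int = 20) -> list:
    """Split `sequence` into fragments in which neighbors share `overlap` bases.

    fragment_len and overlap are hypothetical design parameters.
    """
    if not 0 <= overlap < fragment_len:
        raise ValueError("overlap must be non-negative and smaller than fragment_len")
    step = fragment_len - overlap
    fragments = []
    start = 0
    while True:
        fragments.append(sequence[start:start + fragment_len])
        if start + fragment_len >= len(sequence):
            break  # the last fragment reaches the end of the sequence
        start += step
    return fragments


def reassemble(fragments: list, overlap: int = 20) -> str:
    """Rebuild the original sequence by dropping the repeated overlap bases."""
    return fragments[0] + "".join(f[overlap:] for f in fragments[1:])


# Hypothetical 250-base input sequence.
target = ("ACGTTGCA" * 32)[:250]
pieces = split_into_overlapping_fragments(target, fragment_len=100, overlap=20)
```

In this sketch each fragment ends with the same bases that begin the next fragment, which is one simple way to provide the shared sequence that a downstream assembly step relies on.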

Provided herein are systems for seamless assembly of nucleic acids following generation of polynucleotides by de novo synthesis by methods described herein. In some instances, the system comprises a computer, a material deposition device, a surface, and a nucleic acid assembly surface. In some instances, the computer comprises a readable input file with a nucleic acid sequence. In some instances, the computer processes the nucleic acid sequence to generate instructions for synthesis of the polynucleotide sequence or a plurality of polynucleotide sequences collectively encoding for the nucleic acid sequence. In some instances, the computer provides instructions to the material deposition device for the synthesis of the plurality of polynucleotide sequences. In some instances, the material deposition device deposits nucleosides on the surface for an extension reaction. In some instances, the surface comprises a locus for the extension reaction. In some instances, the locus is a spot, well, microwell, channel, or post. In some instances, the plurality of polynucleotide sequences is synthesized following the extension reaction. In some instances, the plurality of polynucleotide sequences is removed from the surface and prepared for nucleic acid assembly. In some instances, nucleic acid assembly comprises in vitro recombination cloning. In some instances, nucleic acid assembly comprises single-stranded hierarchal DNA assembly. In some instances, nucleic acid assembly comprises overlap extension PCR without primer removal.

Provided herein are methods for polynucleotide synthesis involving phosphoramidite chemistry. In some instances, polynucleotide synthesis comprises coupling a base with phosphoramidite. In some instances, polynucleotide synthesis comprises coupling a base by deposition of phosphoramidite under coupling conditions, wherein the same base is optionally deposited with phosphoramidite more than once, i.e., double coupling. In some instances, polynucleotide synthesis comprises capping of unreacted sites. In some cases, capping is optional. In some instances, polynucleotide synthesis comprises oxidation. In some instances, polynucleotide synthesis comprises deblocking or detritylation. In some instances, polynucleotide synthesis comprises sulfurization. In some cases, polynucleotide synthesis comprises either oxidation or sulfurization. In some instances, between one or each step during a polynucleotide synthesis reaction, the substrate is washed, for example, using tetrazole or acetonitrile. Time frames for any one step in a phosphoramidite synthesis method include less than about 2 min, 1 min, 50 sec, 40 sec, 30 sec, 20 sec and 10 sec.

Polynucleotide synthesis using a phosphoramidite method comprises the sequential addition of a phosphoramidite building block (e.g., nucleoside phosphoramidite) to a growing polynucleotide chain for the formation of a phosphite triester linkage. Phosphoramidite polynucleotide synthesis proceeds in the 3′ to 5′ direction. Phosphoramidite polynucleotide synthesis allows for the controlled addition of one nucleotide to a growing nucleic acid chain per synthesis cycle. In some instances, each synthesis cycle comprises a coupling step. Phosphoramidite coupling involves the formation of a phosphite triester linkage between an activated nucleoside phosphoramidite and a nucleoside bound to the substrate, for example, via a linker. In some instances, the nucleoside phosphoramidite is provided to the substrate activated. In some instances, the nucleoside phosphoramidite is provided to the substrate with an activator. In some instances, nucleoside phosphoramidites are provided to the substrate in a 1.5, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100-fold excess or more over the substrate-bound nucleosides. In some instances, the addition of nucleoside phosphoramidite is performed in an anhydrous environment, for example, in anhydrous acetonitrile. Following addition of a nucleoside phosphoramidite, the substrate is optionally washed. In some instances, the coupling step is repeated one or more additional times, optionally with a wash step between nucleoside phosphoramidite additions to the substrate. In some instances, a polynucleotide synthesis method used herein comprises 1, 2, 3 or more sequential coupling steps. Prior to coupling, in many cases, the nucleoside bound to the substrate is de-protected by removal of a protecting group, where the protecting group functions to prevent polymerization. A common protecting group is 4,4′-dimethoxytrityl (DMT).

Following coupling, phosphoramidite polynucleotide synthesis methods optionally comprise a capping step. In a capping step, the growing polynucleotide is treated with a capping agent. A capping step is useful to block unreacted substrate-bound 5′-OH groups after coupling from further chain elongation, preventing the formation of polynucleotides with internal base deletions. Further, phosphoramidites activated with 1H-tetrazole may react, to a small extent, with the O6 position of guanosine. Without being bound by theory, upon oxidation with I₂/water, this side product, possibly via O6-N7 migration, may undergo depurination. The apurinic sites may end up being cleaved in the course of the final deprotection of the polynucleotide, thus reducing the yield of the full-length product. The O6 modifications may be removed by treatment with the capping reagent prior to oxidation with I₂/water. In some instances, inclusion of a capping step during polynucleotide synthesis decreases the error rate as compared to synthesis without capping. As an example, the capping step comprises treating the substrate-bound polynucleotide with a mixture of acetic anhydride and 1-methylimidazole. Following a capping step, the substrate is optionally washed.

In some instances, following addition of a nucleoside phosphoramidite, and optionally after capping and one or more wash steps, the substrate bound growing nucleic acid is oxidized. In the oxidation step, the phosphite triester is oxidized into a tetracoordinated phosphate triester, a protected precursor of the naturally occurring phosphate diester internucleoside linkage. In some cases, oxidation of the growing polynucleotide is achieved by treatment with iodine and water, optionally in the presence of a weak base (e.g., pyridine, lutidine, collidine). Oxidation may be carried out under anhydrous conditions using, e.g., tert-Butyl hydroperoxide or (1S)-(+)-(10-camphorsulfonyl)-oxaziridine (CSO). In some methods, a capping step is performed following oxidation. A second capping step allows for substrate drying, as residual water from oxidation that may persist can inhibit subsequent coupling. Following oxidation, the substrate and growing polynucleotide are optionally washed. In some instances, the step of oxidation is substituted with a sulfurization step to obtain polynucleotide phosphorothioates, wherein any capping steps can be performed after the sulfurization. Many reagents are capable of efficient sulfur transfer, including but not limited to 3-((Dimethylaminomethylidene)amino)-3H-1,2,4-dithiazole-3-thione (DDTT), 3H-1,2-benzodithiol-3-one 1,1-dioxide, also known as Beaucage reagent, and N,N,N′,N′-Tetraethylthiuram disulfide (TETD).

In order for a subsequent cycle of nucleoside incorporation to occur through coupling, the protecting group at the 5′ end of the substrate bound growing polynucleotide is removed so that the primary hydroxyl group is reactive with a next nucleoside phosphoramidite. In some instances, the protecting group is DMT and deblocking occurs with trichloroacetic acid in dichloromethane. Conducting detritylation for an extended time or with stronger than recommended solutions of acids may lead to increased depurination of solid support-bound polynucleotide and thus reduce the yield of the desired full-length product. Methods and compositions of the invention described herein provide for controlled deblocking conditions limiting undesired depurination reactions. In some cases, the substrate bound polynucleotide is washed after deblocking. In some cases, efficient washing after deblocking contributes to synthesized polynucleotides having a low error rate.

Methods for the synthesis of polynucleotides typically involve an iterating sequence of the following steps: application of a protected monomer to an actively functionalized surface (e.g., locus) to link with either the activated surface, a linker or with a previously deprotected monomer; deprotection of the applied monomer so that it is reactive with a subsequently applied protected monomer; and application of another protected monomer for linking. One or more intermediate steps include oxidation or sulfurization. In some cases, one or more wash steps precede or follow one or all of the steps.
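The iterating sequence of steps can be pictured with a short Python sketch. This is an illustrative assumption rather than a description of any particular synthesizer: the function name and the tuple layout are hypothetical, wash steps are omitted, and attachment of the first monomer to the surface is treated as an ordinary coupling step for simplicity. Because phosphoramidite synthesis proceeds in the 3′ to 5′ direction, the sketch applies the 3′-terminal base first.

```python
def synthesis_schedule(sequence_5to3: str, capping: bool = True,
                       sulfurize: bool = False) -> list:
    """Return an ordered list of (cycle, step, base) tuples for one polynucleotide.

    Each cycle applies a protected monomer (couple), optionally caps unreacted
    sites, oxidizes or sulfurizes the new linkage, and then deblocks the applied
    monomer so that it can react in the next cycle.
    """
    schedule = []
    # Chain growth is 3' to 5', so the 3'-terminal base is applied first.
    for cycle, base in enumerate(reversed(sequence_5to3.upper()), start=1):
        schedule.append((cycle, "couple", base))
        if capping:
            schedule.append((cycle, "cap", None))
        schedule.append((cycle, "sulfurize" if sulfurize else "oxidize", None))
        schedule.append((cycle, "deblock", None))
    return schedule


# Hypothetical two-base example written 5' to 3'.
steps = synthesis_schedule("AC")
coupled_bases = [base for _, step, base in steps if step == "couple"]
```

The same schedule could be extended with explicit wash entries before or after any step where washing is used.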

Methods for phosphoramidite based polynucleotide synthesis comprise a series of chemical steps. In some instances, one or more steps of a synthesis method involve reagent cycling, where one or more steps of the method comprise application to the substrate of a reagent useful for the step. For example, reagents are cycled by a series of liquid deposition and vacuum drying steps. For substrates comprising three-dimensional features such as wells, microwells, channels and the like, reagents are optionally passed through one or more regions of the substrate via the wells and/or channels.

Polynucleotides synthesized using the methods and/or substrates described herein comprise at least about 20, 30, 40, 50, 60, 70, 75, 80, 90, 100, 120, 150, 200, 500 or more bases in length. In some instances, at least about 1 pmol, 10 pmol, 20 pmol, 30 pmol, 40 pmol, 50 pmol, 60 pmol, 70 pmol, 80 pmol, 90 pmol, 100 pmol, 150 pmol, 200 pmol, 300 pmol, 400 pmol, 500 pmol, 600 pmol, 700 pmol, 800 pmol, 900 pmol, 1 nmol, 5 nmol, 10 nmol, 100 nmol or more of a polynucleotide is synthesized within a locus. Methods for polynucleotide synthesis on a surface provided herein allow for synthesis at a fast rate. As an example, at least 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200 nucleotides per hour, or more are synthesized. Nucleotides include adenine, guanine, thymine, cytosine, uridine building blocks, or analogs/modified versions thereof. In some instances, libraries of polynucleotides are synthesized in parallel on a substrate. For example, a substrate comprising about or at least about 100; 1,000; 10,000; 100,000; 1,000,000; 2,000,000; 3,000,000; 4,000,000; or 5,000,000 resolved loci is able to support the synthesis of at least the same number of distinct polynucleotides, wherein each polynucleotide encoding a distinct sequence is synthesized on a resolved locus.

Various suitable methods are known for generating high density polynucleotide arrays. In an exemplary workflow, a substrate surface layer is provided. In the example, chemistry of the surface is altered in order to improve the polynucleotide synthesis process. Areas of low surface energy are generated to repel liquid while areas of high surface energy are generated to attract liquids. The surface itself may be in the form of a planar surface or contain variations in shape, such as protrusions or microwells which increase surface area. In the workflow example, high surface energy molecules selected serve a dual function of supporting DNA chemistry, as disclosed in International Patent Application Publication WO/2015/021080, which is herein incorporated by reference in its entirety.

In situ preparation of polynucleotide arrays is performed on a solid support and utilizes a single nucleotide extension process to extend multiple oligomers in parallel. A deposition device, such as a polynucleotide synthesizer, is designed to release reagents in a step wise fashion such that multiple polynucleotides extend, in parallel, one residue at a time to generate oligomers with a predetermined nucleic acid sequence. In some cases, polynucleotides are cleaved from the surface at this stage. Cleavage includes gas cleavage, e.g., with ammonia or methylamine.

Substrates

Devices used as a surface for polynucleotide synthesis may be in the form of substrates which include, without limitation, homogenous array surfaces, patterned array surfaces, channels, beads, gels, and the like. Provided herein are substrates comprising a plurality of clusters, wherein each cluster comprises a plurality of loci that support the attachment and synthesis of polynucleotides. The term “locus” as used herein refers to a discrete region on a structure which provides support for polynucleotides encoding for a single predetermined sequence to extend from the surface. In some instances, a locus is on a two dimensional surface, e.g., a substantially planar surface. In some instances, a locus is on a three-dimensional surface, e.g., a well, microwell, channel, or post. In some instances, a surface of a locus comprises a material that is actively functionalized to attach to at least one nucleotide for polynucleotide synthesis, or preferably, a population of identical nucleotides for synthesis of a population of polynucleotides. In some instances, polynucleotide refers to a population of polynucleotides encoding for the same nucleic acid sequence. In some cases, a surface of a substrate is inclusive of one or a plurality of surfaces of a substrate. The average error rates for polynucleotides synthesized within a library described herein using the systems and methods provided are often less than 1 in 1000, less than about 1 in 2000, less than about 1 in 3000, or less, often without error correction.
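To put the stated average error rates in context, the following Python sketch estimates the fraction of polynucleotides of a given length expected to contain no errors. It rests on a simplifying assumption that is not stated in this disclosure, namely that errors occur independently at each base with the same probability; the function name and the 100 base example length are hypothetical.

```python
def error_free_fraction(error_rate_per_base: float, length: int) -> float:
    """Expected fraction of error-free polynucleotides of `length` bases.

    Assumes independent, identically distributed per-base errors (a
    simplification made only for this illustration).
    """
    if not 0.0 <= error_rate_per_base <= 1.0:
        raise ValueError("error_rate_per_base must be between 0 and 1")
    return (1.0 - error_rate_per_base) ** length


# Compare the three average error rates named in the text for a 100-base product.
for denominator in (1000, 2000, 3000):
    fraction = error_free_fraction(1.0 / denominator, 100)
    print(f"1 in {denominator}: {fraction:.3f} of 100-mers expected error-free")
```

Under this assumption, lowering the per-base error rate has a compounding effect on longer polynucleotides, since the exponent grows with length.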

Provided herein are surfaces that support the parallel synthesis of a plurality of polynucleotides having different predetermined sequences at addressable locations on a common support. In some instances, a substrate provides support for the synthesis of more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000; 5,000; 10,000; 20,000; 50,000; 100,000; 200,000; 300,000; 400,000; 500,000; 600,000; 700,000; 800,000; 900,000; 1,000,000; 1,200,000; 1,400,000; 1,600,000; 1,800,000; 2,000,000; 2,500,000; 3,000,000; 3,500,000; 4,000,000; 4,500,000; 5,000,000; 10,000,000 or more non-identical polynucleotides. In some cases, the surface provides support for the synthesis of more than 50, 100, 200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2,000; 5,000; 10,000; 20,000; 50,000; 100,000; 200,000; 300,000; 400,000; 500,000; 600,000; 700,000; 800,000; 900,000; 1,000,000; 1,200,000; 1,400,000; 1,600,000; 1,800,000; 2,000,000; 2,500,000; 3,000,000; 3,500,000; 4,000,000; 4,500,000; 5,000,000; 10,000,000 or more polynucleotides encoding for distinct sequences. In some instances, at least a portion of the polynucleotides have an identical sequence or are configured to be synthesized with an identical sequence. In some instances, the substrate provides a surface environment for the growth of polynucleotides having at least 80, 90, 100, 120, 150, 175, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500 bases or more.

Provided herein are methods for polynucleotide synthesis on distinct loci of a substrate, wherein each locus supports the synthesis of a population of polynucleotides. In some cases, each locus supports the synthesis of a population of polynucleotides having a different sequence than a population of polynucleotides grown on another locus. In some instances, each polynucleotide sequence is synthesized with 1, 2, 3, 4, 5, 6, 7, 8, 9 or more redundancy across different loci within the same cluster of loci on a surface for polynucleotide synthesis. In some instances, the loci of a substrate are located within a plurality of clusters. In some instances, a substrate comprises at least 10, 500, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 11000, 12000, 13000, 14000, 15000, 20000, 30000, 40000, 50000 or more clusters. In some instances, a substrate comprises more than 2,000; 5,000; 10,000; 100,000; 200,000; 300,000; 400,000; 500,000; 600,000; 700,000; 800,000; 900,000; 1,000,000; 1,100,000; 1,200,000; 1,300,000; 1,400,000; 1,500,000; 1,600,000; 1,700,000; 1,800,000; 1,900,000; 2,000,000; 2,500,000; 3,000,000; 3,500,000; 4,000,000; 4,500,000; 5,000,000; or 10,000,000 or more distinct loci. In some instances, a substrate comprises about 10,000 distinct loci. The number of loci within a single cluster is varied in different instances. In some cases, each cluster includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 120, 130, 150, 200, 300, 400, 500 or more loci. In some instances, each cluster includes about 50-500 loci. In some instances, each cluster includes about 100-200 loci. In some instances, each cluster includes about 100-150 loci. In some instances, each cluster includes about 109, 121, 130 or 137 loci. In some instances, each cluster includes about 19, 20, 61, 64 or more loci.

In some instances, the number of distinct polynucleotides synthesized on a substrate is dependent on the number of distinct loci available in the substrate. In some instances, the density of loci within a cluster of a substrate is at least or about 1, 10, 25, 50, 65, 75, 100, 130, 150, 175, 200, 300, 400, 500, 1,000 or more loci per mm². In some cases, a substrate comprises 10-500, 25-400, 50-500, 100-500, 150-500, 10-250, 50-250, 10-200, or 50-200 mm². In some instances, the distance between the centers of two adjacent loci within a cluster is from about 10-500, from about 10-200, or from about 10-100 μm. In some instances, the distance between the centers of two adjacent loci is greater than about 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 μm. In some instances, the distance between the centers of two adjacent loci is less than about 200, 150, 100, 80, 70, 60, 50, 40, 30, 20 or 10 μm. In some instances, each locus has a width of about 0.5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 μm. In some cases, each locus has a width of about 0.5-100, 0.5-50, or 10-75 μm.
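The relationship between the center-to-center distance of adjacent loci and the density of loci can be illustrated with a brief Python sketch. The regular square and hexagonal layouts, the function name, and the example 100 μm distance are assumptions introduced only for this example; the substrates described herein are not limited to regular grids.

```python
import math


def loci_per_mm2(pitch_um: float, packing: str = "square") -> float:
    """Approximate loci density for a regular grid with the given pitch.

    pitch_um is the center-to-center distance between adjacent loci in
    micrometers. The choice of packing geometry is a hypothetical layout.
    """
    if pitch_um <= 0:
        raise ValueError("pitch_um must be positive")
    pitch_mm = pitch_um / 1000.0
    if packing == "square":
        # One locus per pitch-by-pitch square cell.
        return 1.0 / pitch_mm ** 2
    if packing == "hexagonal":
        # One locus per rhombic cell of area (sqrt(3) / 2) * pitch^2.
        return 2.0 / (math.sqrt(3) * pitch_mm ** 2)
    raise ValueError("packing must be 'square' or 'hexagonal'")


square_density = loci_per_mm2(100.0)               # loci per mm² on a square grid
hex_density = loci_per_mm2(100.0, "hexagonal")     # loci per mm² on a hexagonal grid
```

On these assumptions, halving the distance between locus centers quadruples the density, which is why small changes in spacing translate into large changes in the number of loci a cluster can hold.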

In some instances, the density of clusters within a substrate is at least or about 1 cluster per 100 mm², 1 cluster per 10 mm², 1 cluster per 5 mm², 1 cluster per 4 mm², 1 cluster per 3 mm², 1 cluster per 2 mm², 1 cluster per 1 mm², 2 clusters per 1 mm², 3 clusters per 1 mm², 4 clusters per 1 mm², 5 clusters per 1 mm², 10 clusters per 1 mm², 50 clusters per 1 mm² or more. In some instances, a substrate comprises from about 1 cluster per 10 mm² to about 10 clusters per 1 mm². In some instances, the distance between the centers of two adjacent clusters is at least or about 50, 100, 200, 500, 1000, 2000, or 5000 μm. In some cases, the distance between the centers of two adjacent clusters is between about 50-100, 50-200, 50-300, 50-500, or 100-2000 μm. In some cases, the distance between the centers of two adjacent clusters is between about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-10, 0.5-5, or 0.5-2 mm. In some cases, each cluster has a cross section of about 0.5 to about 2, about 0.5 to about 1, or about 1 to about 2 mm. In some cases, each cluster has a cross section of about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 or 2 mm. In some cases, each cluster has an interior cross section of about 0.5, 0.6, 0.7, 0.8, 0.9, 1, 1.1, 1.15, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9 or 2 mm.

In some instances, a substrate is about the size of a standard 96 well plate, for example between about 100 and about 200 mm by between about 50 and about 150 mm. In some instances, a substrate has a diameter less than or equal to about 1000, 500, 450, 400, 300, 250, 200, 150, 100 or 50 mm. In some instances, the diameter of a substrate is between about 25-1000, 25-800, 25-600, 25-500, 25-400, 25-300, or 25-200 mm. In some instances, a substrate has a planar surface area of at least about 100; 200; 500; 1,000; 2,000; 5,000; 10,000; 12,000; 15,000; 20,000; 30,000; 40,000; 50,000 mm² or more. In some instances, the thickness of a substrate is between about 50-2000, 50-1000, 100-1000, 200-1000, or 250-1000 mm.

Surface Materials

Substrates, devices, and reactors provided herein are fabricated from any variety of materials suitable for the methods, compositions, and systems described herein. In certain instances, substrate materials are fabricated to exhibit a low level of nucleotide binding. In some instances, substrate materials are modified to generate distinct surfaces that exhibit a high level of nucleotide binding. In some instances, substrate materials are transparent to visible and/or UV light. In some instances, substrate materials are sufficiently conductive, e.g., are able to form uniform electric fields across all or a portion of a substrate. In some instances, conductive materials are connected to an electric ground. In some instances, the substrate is heat conductive or insulated. In some instances, the materials are chemical resistant and heat resistant to support chemical or biochemical reactions, for example polynucleotide synthesis reaction processes. In some instances, a substrate comprises flexible materials. For flexible materials, materials can include, without limitation: nylon, both modified and unmodified, nitrocellulose, polypropylene, and the like. In some instances, a substrate comprises rigid materials. For rigid materials, materials can include, without limitation: glass, fused silica, silicon, plastics (for example, polytetrafluoroethylene, polypropylene, polystyrene, polycarbonate, and blends thereof, and the like), and metals (for example, gold, platinum, and the like). The substrate, solid support or reactors can be fabricated from a material selected from the group consisting of silicon, polystyrene, agarose, dextran, cellulosic polymers, polyacrylamides, polydimethylsiloxane (PDMS), and glass. The substrates/solid supports or the microstructures or reactors therein may be manufactured with a combination of materials listed herein or any other suitable material known in the art.

Surface Architecture

Provided herein are substrates for the methods, compositions, and systems described herein, wherein the substrates have a surface architecture suitable for the methods, compositions, and systems described herein. In some instances, a substrate comprises raised and/or lowered features. One benefit of having such features is an increase in surface area to support polynucleotide synthesis. In some instances, a substrate having raised and/or lowered features is referred to as a three-dimensional substrate. In some cases, a three-dimensional substrate comprises one or more channels. In some cases, one or more loci comprise a channel. In some cases, the channels are accessible to reagent deposition via a deposition device such as a polynucleotide synthesizer. In some cases, reagents and/or fluids collect in a larger well in fluid communication with one or more channels. For example, a substrate comprises a plurality of channels corresponding to a plurality of loci within a cluster, and the plurality of channels are in fluid communication with one well of the cluster. In some methods, a library of polynucleotides is synthesized in a plurality of loci of a cluster.

Provided herein are substrates for the methods, compositions, and systems described herein, wherein the substrates are configured for polynucleotide synthesis. In some instances, the structure is configured to allow for controlled flow and mass transfer paths for polynucleotide synthesis on a surface. In some instances, the configuration of a substrate allows for the controlled and even distribution of mass transfer paths, chemical exposure times, and/or wash efficacy during polynucleotide synthesis. In some instances, the configuration of a substrate allows for increased sweep efficiency, for example by providing sufficient volume for a growing polynucleotide such that the excluded volume by the growing polynucleotide does not take up more than 50, 45, 40, 35, 30, 25, 20, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1%, or less of the initially available volume that is available or suitable for growing the polynucleotide. In some instances, a three-dimensional structure allows for managed flow of fluid to allow for the rapid exchange of chemical exposure.

Provided herein are substrates for the methods, compositions, and systems described herein, wherein the substrates comprise structures suitable for the methods, compositions, and systems described herein. In some instances, segregation is achieved by physical structure. In some instances, segregation is achieved by differential functionalization of the surface generating active and passive regions for polynucleotide synthesis. In some instances, differential functionalization is achieved by alternating the hydrophobicity across the substrate surface, thereby creating water contact angle effects that cause beading or wetting of the deposited reagents. Employing larger structures can decrease splashing and cross-contamination of distinct polynucleotide synthesis locations with reagents of the neighboring spots. In some cases, a device, such as a material deposition device, is used to deposit reagents to distinct polynucleotide synthesis locations. Substrates having three-dimensional features are configured in a manner that allows for the synthesis of a large number of polynucleotides (e.g., more than about 10,000) with a low error rate (e.g., less than about 1:500, 1:1,000, 1:1,500, 1:2,000, 1:3,000, 1:5,000, or 1:10,000). In some cases, a substrate comprises features with a density of about or greater than about 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400 or 500 features per mm².

A well of a substrate may have the same or different width, height, and/or volume as another well of the substrate. A channel of a substrate may have the same or different width, height, and/or volume as another channel of the substrate. In some instances, the diameter of a cluster or the diameter of a well comprising a cluster, or both, is between about 0.05-50, 0.05-10, 0.05-5, 0.05-4, 0.05-3, 0.05-2, 0.05-1, 0.05-0.5, 0.05-0.1, 0.1-10, 0.2-10, 0.3-10, 0.4-10, 0.5-10, 0.5-5, or 0.5-2 mm. In some instances, the diameter of a cluster or well or both is less than or about 5, 4, 3, 2, 1, 0.5, 0.1, 0.09, 0.08, 0.07, 0.06, or 0.05 mm. In some instances, the diameter of a cluster or well or both is between about 1.0 and about 1.3 mm. In some instances, the diameter of a cluster or well, or both, is about 1.150 mm. In some instances, the diameter of a cluster or well, or both, is about 0.08 mm. The diameter of a cluster refers to clusters within a two-dimensional or three-dimensional substrate.

In some instances, the height of a well is from about 20-1000, 50-1000, 100-1000, 200-1000, 300-1000, 400-1000, or 500-1000 μm. In some cases, the height of a well is less than about 1000, 900, 800, 700, or 600 μm.

In some instances, a substrate comprises a plurality of channels corresponding to a plurality of loci within a cluster, wherein the height or depth of a channel is 5-500, 5-400, 5-300, 5-200, 5-100, 5-50, or 10-50 μm. In some cases, the height of a channel is less than 100, 80, 60, 40, or 20 μm.

In some instances, the diameter of a channel, locus (e.g., in a substantially planar substrate) or both channel and locus (e.g., in a three-dimensional substrate wherein a locus corresponds to a channel) is from about 1-1000, 1-500, 1-200, 1-100, 5-100, or 10-100 μm, for example, about 90, 80, 70, 60, 50, 40, 30, 20 or 10 μm. In some instances, the diameter of a channel, locus, or both channel and locus is less than about 100, 90, 80, 70, 60, 50, 40, 30, 20 or 10 μm. In some instances, the distance between the centers of two adjacent channels, loci, or channels and loci is from about 1-500, 1-200, 1-100, 5-200, 5-100, 5-50, or 5-30 μm, for example, about 20 μm.

Surface Modifications

Provided herein are methods for polynucleotide synthesis on a surface, wherein the surface comprises various surface modifications. In some instances, the surface modifications are employed for the chemical and/or physical alteration of a surface by an additive or subtractive process to change one or more chemical and/or physical properties of a substrate surface or a selected site or region of a substrate surface. For example, surface modifications include, without limitation, (1) changing the wetting properties of a surface, (2) functionalizing a surface, i.e., providing, modifying or substituting surface functional groups, (3) defunctionalizing a surface, i.e., removing surface functional groups, (4) otherwise altering the chemical composition of a surface, e.g., through etching, (5) increasing or decreasing surface roughness, (6) providing a coating on a surface, e.g., a coating that exhibits wetting properties that are different from the wetting properties of the surface, and/or (7) depositing particulates on a surface.

In some cases, the addition of a chemical layer on top of a surface (referred to as adhesion promoter) facilitates structured patterning of loci on a surface of a substrate. Exemplary surfaces for application of adhesion promotion include, without limitation, glass, silicon, silicon dioxide, and silicon nitride. In some cases, the adhesion promoter is a chemical with a high surface energy. In some instances, a second chemical layer is deposited on a surface of a substrate. In some cases, the second chemical layer has a low surface energy. In some cases, surface energy of a chemical layer coated on a surface supports localization of droplets on the surface. Depending on the patterning arrangement selected, the proximity of loci and/or area of fluid contact at the loci are alterable.

In some instances, a substrate surface, or resolved loci, onto which nucleic acids or other moieties are deposited, e.g., for polynucleotide synthesis, are smooth or substantially planar (e.g., two-dimensional) or have irregularities, such as raised or lowered features (e.g., three-dimensional features). In some instances, a substrate surface is modified with one or more different layers of compounds. Such modification layers of interest include, without limitation, inorganic and organic layers such as metals, metal oxides, polymers, small organic molecules and the like.

In some instances, resolved loci of a substrate are functionalized with one or more moieties that increase and/or decrease surface energy. In some cases, a moiety is chemically inert. In some cases, a moiety is configured to support a desired chemical reaction, for example, one or more processes in a polynucleotide synthesis reaction. The surface energy, or hydrophobicity, of a surface is a factor for determining the affinity of a nucleotide to attach onto the surface. In some instances, a method for substrate functionalization comprises: (a) providing a substrate having a surface that comprises silicon dioxide; and (b) silanizing the surface using a suitable silanizing agent described herein or otherwise known in the art, for example, an organofunctional alkoxysilane molecule. Methods and functionalizing agents are described in U.S. Pat. No. 5,474,796, which is herein incorporated by reference in its entirety.

In some instances, a substrate surface is functionalized by contact with a derivatizing composition that contains a mixture of silanes, under reaction conditions effective to couple the silanes to the substrate surface, typically via reactive hydrophilic moieties present on the substrate surface. Silanization generally covers a surface through self-assembly with organofunctional alkoxysilane molecules. A variety of siloxane functionalizing reagents can further be used as currently known in the art, e.g., for lowering or increasing surface energy. The organofunctional alkoxysilanes are classified according to their organic functions.

Computer Systems

Any of the systems described herein may be operably linked to a computer and may be automated through a computer either locally or remotely. In some instances, the methods and systems of the invention further comprise software programs on computer systems and use thereof. Accordingly, computerized control for the synchronization of the dispense/vacuum/refill functions, such as orchestrating and synchronizing the material deposition device movement, dispense action and vacuum actuation, is within the bounds of the invention. The computer systems may be programmed to interface between the user specified base sequence and the position of a material deposition device to deliver the correct reagents to specified regions of the substrate.
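By way of a non-limiting illustration, the interface between user specified base sequences and reagent delivery may be sketched as follows. This is a minimal sketch and not a description of any particular instrument software: the function name `deposition_schedule`, the use of (row, column) tuples as locus identifiers, and the assumption that each chain grows from its 3′ end (as in conventional phosphoramidite chemistry) are all illustrative choices not specified above.

```python
from collections import defaultdict


def deposition_schedule(locus_sequences):
    """Turn {locus: sequence written 5'->3'} into a per-cycle delivery plan.

    Returns a list with one entry per synthesis cycle; each entry maps a base
    to the sorted list of loci that should receive that base in that cycle.
    Assumption: chains are extended from the 3' end, so the last base of the
    written sequence is delivered first.
    """
    longest = max((len(seq) for seq in locus_sequences.values()), default=0)
    schedule = []
    for cycle in range(longest):
        loci_by_base = defaultdict(list)
        for locus, seq in locus_sequences.items():
            growth_order = seq.upper()[::-1]  # 3' base first
            if cycle < len(growth_order):
                loci_by_base[growth_order[cycle]].append(locus)
        schedule.append({base: sorted(loci) for base, loci in loci_by_base.items()})
    return schedule


if __name__ == "__main__":
    # Hypothetical two-locus layout used only to show the output shape.
    plan = deposition_schedule({(0, 0): "ACG", (0, 1): "TG"})
    for cycle_number, deliveries in enumerate(plan, start=1):
        print(cycle_number, deliveries)
```

Grouping loci by base within each cycle reflects one plausible way a controller could minimize reagent changes; other orderings are equally possible.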

The computer system 1000 illustrated in FIG. 10 may be understood as a logical apparatus that can read instructions from media 1011 and/or a network port 1005, which can optionally be connected to server 1009 having fixed media 1012. The system, such as shown in FIG. 10, can include a CPU 1001, disk drives 1003, optional input devices such as keyboard 1015 and/or mouse 1016 and optional monitor 1007. Data communication can be achieved through the indicated communication medium to a server at a local or a remote location. The communication medium can include any means of transmitting and/or receiving data. For example, the communication medium can be a network connection, a wireless connection or an internet connection. Such a connection can provide for communication over the World Wide Web. It is envisioned that data relating to the present disclosure can be transmitted over such networks or connections for reception and/or review by a party 1022 as illustrated in FIG. 10.

FIG. 11 is a block diagram illustrating architecture of a computer system 1100 that can be used in connection with example embodiments of the present invention. As depicted in FIG. 11, the example computer system can include a processor 1102 for processing instructions. Non-limiting examples of processors include: Intel® Xeon® processor, AMD Opteron™ processor, Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0 processor, ARM Cortex-A8 Samsung S5PC100 processor, ARM Cortex-A8 Apple A4 processor, Marvell PXA 930 processor, or a functionally-equivalent processor. Multiple threads of execution can be used for parallel processing. In some instances, multiple processors or processors with multiple cores can also be used, whether in a single computer system, in a cluster, or distributed across systems over a network comprising a plurality of computers, cell phones, and/or personal data assistant devices.

As illustrated in FIG. 11, a high speed cache 1104 can be connected to, or incorporated in, the processor 1102 to provide a high speed memory for instructions or data that have been recently, or are frequently, used by processor 1102. The processor 1102 is connected to a north bridge 1106 by a processor bus 1108. The north bridge 1106 is connected to random access memory (RAM) 1110 by a memory bus 1112 and manages access to the RAM 1110 by the processor 1102. The north bridge 1106 is also connected to a south bridge 1114 by a chipset bus 1116. The south bridge 1114 is, in turn, connected to a peripheral bus 1118. The peripheral bus can be, for example, PCI, PCI-X, PCI Express, or other peripheral bus. The north bridge and south bridge are often referred to as a processor chipset and manage data transfer between the processor, RAM, and peripheral components on the peripheral bus 1118. In some alternative architectures, the functionality of the north bridge can be incorporated into the processor instead of using a separate north bridge chip. In some instances, system 1100 can include an accelerator card 1122 attached to the peripheral bus 1118. The accelerator can include field programmable gate arrays (FPGAs) or other hardware for accelerating certain processing. For example, an accelerator can be used for adaptive data restructuring or to evaluate algebraic expressions used in extended set processing.

Software and data are stored in external storage 1124 and can be loaded into RAM 1110 and/or cache 1104 for use by the processor. The system 1100 includes an operating system for managing system resources; non-limiting examples of operating systems include: Linux, Windows™, MACOS™, BlackBerry OS™, iOS™, and other functionally-equivalent operating systems, as well as application software running on top of the operating system for managing data storage and optimization in accordance with example embodiments of the present invention. In this example, system 1100 also includes network interface cards (NICs) 1120 and 1121 connected to the peripheral bus for providing network interfaces to external storage, such as Network Attached Storage (NAS) and other computer systems that can be used for distributed parallel processing.

FIG. 12 is a block diagram of a multiprocessor computer system using a shared virtual address memory space in accordance with an example embodiment. The system includes a plurality of processors 1202a-f that can access a shared memory subsystem 1204. The system incorporates a plurality of programmable hardware memory algorithm processors (MAPs) 1206a-f in the memory subsystem 1204. Each MAP 1206a-f can comprise a memory 1208a-f and one or more field programmable gate arrays (FPGAs) 1210a-f. The MAP provides a configurable functional unit, and particular algorithms or portions of algorithms can be provided to the FPGAs 1210a-f for processing in close coordination with a respective processor. For example, the MAPs can be used to evaluate algebraic expressions regarding the data model and to perform adaptive data restructuring in example embodiments. In this example, each MAP is globally accessible by all of the processors for these purposes. In one configuration, each MAP can use Direct Memory Access (DMA) to access an associated memory 1208a-f, allowing it to execute tasks independently of, and asynchronously from, the respective microprocessor 1202a-f. In this configuration, a MAP can feed results directly to another MAP for pipelining and parallel execution of algorithms.

FIG. 13 is a diagram showing a network with a plurality of computer systems 1302a and 1302b, a plurality of cell phones and personal data assistants 1302c, and Network Attached Storage (NAS) 1304a and 1304b. In example embodiments, systems 1302a, 1302b, and 1302c can manage data storage and optimize data access for data stored in Network Attached Storage (NAS) 1304a and 1304b. A mathematical model can be used for the data and be evaluated using distributed parallel processing across computer systems 1302a and 1302b, and cell phone and personal data assistant systems 1302c. Computer systems 1302a and 1302b, and cell phone and personal data assistant systems 1302c can also provide parallel processing for adaptive data restructuring of the data stored in Network Attached Storage (NAS) 1304a and 1304b. FIG. 13 illustrates an example only, and a wide variety of other computer architectures and systems can be used in conjunction with the various embodiments of the present invention. For example, a blade server can be used to provide parallel processing. Processor blades can be connected through a back plane to provide parallel processing. Storage can also be connected to the back plane or as Network Attached Storage (NAS) through a separate network interface. In some instances, processors can maintain separate memory spaces and transmit data through network interfaces, back plane or other connectors for parallel processing by other processors. In some instances, some or all of the processors can use a shared virtual address memory space.

Any of the systems described herein may comprise sequence information stored on non-transitory computer readable storage media. In some instances, any of the systems described herein comprise a computer input file. In some instances, the computer input file comprises sequence information. In some instances, the computer input file comprises instructions for synthesis of a plurality of polynucleotide sequences. In some instances, the instructions are received by a computer. In some instances, the instructions are processed by the computer. In some instances, the instructions are transmitted to a material deposition device. In some instances, the non-transitory computer readable storage media is encoded with a program including instructions executable by the operating system of an optionally networked digital processing device. In some instances, a computer readable storage medium is a tangible component of a digital processing device. In some instances, a computer readable storage medium is optionally removable from a digital processing device. In some instances, a computer readable storage medium includes, by way of non-limiting examples, CD-ROMs, DVDs, flash memory devices, solid state memory, magnetic disk drives, magnetic tape drives, optical disk drives, cloud computing systems and services, and the like. In some instances, the program and instructions are permanently, substantially permanently, semi-permanently, or non-transitorily encoded on the media.

EXAMPLES

The following examples are given for the purpose of illustrating various embodiments of the invention and are not meant to limit the present invention in any fashion. The present examples, along with the methods described herein, are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Changes therein and other uses which are encompassed within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.

Example 1: Functionalization of a Substrate Surface

A substrate was functionalized to support the attachment and synthesis of a library of polynucleotides. The substrate surface was first wet cleaned using a piranha solution comprising 90% H₂SO₄ and 10% H₂O₂ for 20 minutes. The substrate was rinsed in several beakers with deionized water, held under a deionized water gooseneck faucet for 5 min, and dried with N₂. The substrate was subsequently soaked in NH₄OH (1:100; 3 mL:300 mL) for 5 min, rinsed with DI water using a handgun, soaked in three successive beakers with deionized water for 1 min each, and then rinsed again with deionized water using the handgun. The substrate was then plasma cleaned by exposing the substrate surface to O₂. A SAMCO PC-300 instrument was used to plasma etch O₂ at 250 watts for 1 min in downstream mode.

The cleaned substrate surface was actively functionalized with a solution comprising N-(3-triethoxysilylpropyl)-4-hydroxybutyramide using a YES-1224P vapor deposition oven system with the following parameters: 0.5 to 1 torr, 60 min, 70° C., 135° C. vaporizer. The substrate surface was resist coated using a Brewer Science 200X spin coater. SPR™ 3612 photoresist was spin coated on the substrate at 2500 rpm for 40 sec. The substrate was pre-baked for 30 min at 90° C. on a Brewer hot plate. The substrate was subjected to photolithography using a Karl Suss MA6 mask aligner instrument. The substrate was exposed for 2.2 sec and developed for 1 min in MSF 26A. Remaining developer was rinsed with the handgun and the substrate soaked in water for 5 min. The substrate was baked for 30 min at 100° C. in the oven, followed by visual inspection for lithography defects using a Nikon L200. A cleaning process was used to remove residual resist using the SAMCO PC-300 instrument to O₂ plasma etch at 250 watts for 1 min.

The substrate surface was passively functionalized with a 100 μL solution of perfluorooctyltrichlorosilane mixed with 10 μL light mineral oil. The substrate was placed in a chamber, pumped for 10 min, and then the valve was closed to the pump and left to stand for 10 min. The chamber was vented to air. The substrate was resist stripped by performing two soaks for 5 min in 500 mL NMP at 70° C. with ultrasonication at maximum power (9 on Crest system). The substrate was then soaked for 5 min in 500 mL isopropanol at room temperature with ultrasonication at maximum power. The substrate was dipped in 300 mL of 200 proof ethanol and blown dry with N₂. The functionalized surface was activated to serve as a support for polynucleotide synthesis.

Example 2: Synthesis of a 50-Mer Sequence on an Oligonucleotide Synthesis Device

A two-dimensional oligonucleotide synthesis device was assembled into a flowcell, which was connected to a DNA synthesizer (Applied Biosystems, “ABI394 DNA Synthesizer”). The two-dimensional oligonucleotide synthesis device was uniformly functionalized with N-(3-TRIETHOXYSILYLPROPYL)-4-HYDROXYBUTYRAMIDE (Gelest) and was used to synthesize an exemplary polynucleotide of 50 bp (“50-mer polynucleotide”) using polynucleotide synthesis methods described herein.

The sequence of the 50-mer was as described in SEQ ID NO.: 1. 5′AGACAATCAACCATTTGGGGTGGACAGCCTTGACCTCTAGACTTCGGCAT##TTTTTTTTTT3′ (SEQ ID NO.: 1), where # denotes Thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes), which is a cleavable linker enabling the release of polynucleotides from the surface during deprotection.
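As an illustrative check of the construct layout described above, the synthesized string can be split at the cleavable linker positions. This is a sketch only: the helper name `split_construct` is hypothetical, and it assumes that the portion written 5′ of the `#` positions is the released product while the poly-T portion 3′ of the linker remains with the surface.

```python
# Construct as written 5'->3' above; '#' marks a cleavable linker position.
SEQ_ID_1_CONSTRUCT = (
    "AGACAATCAACCATTTGGGGTGGACAGCCTTGACCTCTAGACTTCGGCAT"
    "##"
    "TTTTTTTTTT"
)


def split_construct(construct, linker="#"):
    """Split a construct into (product, linker run, surface-side spacer).

    Assumes a single uninterrupted run of linker characters.
    """
    first = construct.index(linker)
    last = construct.rindex(linker)
    return construct[:first], construct[first:last + 1], construct[last + 1:]


product, linker_run, spacer = split_construct(SEQ_ID_1_CONSTRUCT)

if __name__ == "__main__":
    print(len(product), linker_run, len(spacer))
```

The same helper applies unchanged to any construct that uses the same `#` notation.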

The synthesis was done using standard DNA synthesis chemistry (coupling, capping, oxidation, and deblocking) on the ABI394 DNA Synthesizer according to the protocol in Table 1.

TABLE 1 Synthesis Protocol (General DNA Synthesis)

Process Name
    Process Step                                  Time (sec)

WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      23
    N2 System Flush                               4
    Acetonitrile System Flush                     4
DNA BASE ADDITION (Phosphoramidite + Activator Flow)
    Activator Manifold Flush                      2
    Activator to Flowcell                         6
    Activator + Phosphoramidite to Flowcell       6
    Activator to Flowcell                         0.5
    Activator + Phosphoramidite to Flowcell       5
    Activator to Flowcell                         0.5
    Activator + Phosphoramidite to Flowcell       5
    Activator to Flowcell                         0.5
    Activator + Phosphoramidite to Flowcell       5
    Incubate for 25 sec                           25
WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      15
    N2 System Flush                               4
    Acetonitrile System Flush                     4
DNA BASE ADDITION (Phosphoramidite + Activator Flow)
    Activator Manifold Flush                      2
    Activator to Flowcell                         5
    Activator + Phosphoramidite to Flowcell       18
    Incubate for 25 sec                           25
WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      15
    N2 System Flush                               4
    Acetonitrile System Flush                     4
CAPPING (CapA + B, 1:1, Flow)
    CapA + B to Flowcell                          15
WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      15
    Acetonitrile System Flush                     4
OXIDATION (Oxidizer Flow)
    Oxidizer to Flowcell                          18
WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    N2 System Flush                               4
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      15
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      15
    N2 System Flush                               4
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      23
    N2 System Flush                               4
    Acetonitrile System Flush                     4
DEBLOCKING (Deblock Flow)
    Deblock to Flowcell                           36
WASH (Acetonitrile Wash Flow)
    Acetonitrile System Flush                     4
    N2 System Flush                               4
    Acetonitrile System Flush                     4
    Acetonitrile to Flowcell                      18
    N2 System Flush                               4.13
    Acetonitrile System Flush                     4.13
    Acetonitrile to Flowcell                      15

The phosphoramidite/activator combination was delivered similarly to the delivery of bulk reagents through the flowcell. No drying steps were performed as the environment stays “wet” with reagent the entire time.

The flow restrictor was removed from the ABI394 DNA Synthesizer to enable faster flow. Without the flow restrictor, flow rates were roughly 100 μL/sec for amidites (0.1M in ACN), Activator (0.25M Benzoylthiotetrazole (“BTT”; 30-3070-xx from GlenResearch) in ACN), and Ox (0.02M I₂ in 20% pyridine, 10% water, and 70% THF); roughly 200 μL/sec for acetonitrile (“ACN”) and capping reagents (1:1 mix of CapA and CapB, wherein CapA is acetic anhydride in THF/Pyridine and CapB is 16% 1-methylimidazole in THF); and roughly 300 μL/sec for Deblock (3% dichloroacetic acid in toluene), compared to about 50 μL/sec for all reagents with the flow restrictor. The time to completely push out Oxidizer was observed, the timing for chemical flow times was adjusted accordingly, and an extra ACN wash was introduced between different chemicals. After polynucleotide synthesis, the chip was deprotected in gaseous ammonia overnight at 75 psi. Five drops of water were applied to the surface to recover polynucleotides. The recovered polynucleotides were then analyzed on a BioAnalyzer small RNA chip (data not shown).

Example 3: Synthesis of a 100-Mer Sequence on an Oligonucleotide Synthesis Device

The same process as described in Example 2 for the synthesis of the 50-mer sequence was used for the synthesis of a 100-mer polynucleotide (“100-mer polynucleotide”; 5′CGGGATCCTTATCGTCATCGTCGTACAGATCCCGACCCATTTGCTGTCCACCAGTCATGCTAGCCATACCATGATGATGATGATGATGAGAACCCCGCAT##TTTTTTTTTT3′, where # denotes Thymidine-succinyl hexamide CED phosphoramidite (CLP-2244 from ChemGenes); SEQ ID NO.: 2) on two different silicon chips, the first one uniformly functionalized with N-(3-TRIETHOXYSILYLPROPYL)-4-HYDROXYBUTYRAMIDE and the second one functionalized with a 5/95 mix of 11-acetoxyundecyltriethoxysilane and n-decyltriethoxysilane, and the polynucleotides extracted from the surface were analyzed on a BioAnalyzer instrument (data not shown).

All ten samples from the two chips were further PCR amplified using a forward (5′ATGCGGGGTTCTCATCATC3′; SEQ ID NO.: 3) and a reverse (5′CGGGATCCTTATCGTCATCG3′; SEQ ID NO.: 4) primer in a 50 μL PCR mix (25 μL NEB Q5 mastermix, 2.5 μL 10 μM Forward primer, 2.5 μL 10 μM Reverse primer, 1 μL polynucleotide extracted from the surface, and water up to 50 μL) using the following thermal cycling program:

98° C. 30 sec

98° C. 10 sec; 63° C. 10 sec; 72° C. 10 sec; repeat 12 cycles

72° C. 2 min
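The primer placement and reaction arithmetic for this amplification can be checked with a short sketch. The sequences and volumes are taken from the text above; the function names are illustrative, and the time total counts only the programmed hold times, ignoring ramping between temperatures (an assumption, since ramp rates are not given).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# 100-mer portion of SEQ ID NO.: 2 (linker and poly-T spacer omitted),
# written in blocks of ten bases for easier proofreading.
TEMPLATE = (
    "CGGGATCCTT" "ATCGTCATCG" "TCGTACAGAT" "CCCGACCCAT" "TTGCTGTCCA"
    "CCAGTCATGC" "TAGCCATACC" "ATGATGATGA" "TGATGATGAG" "AACCCCGCAT"
)
FORWARD_PRIMER = "ATGCGGGGTTCTCATCATC"   # SEQ ID NO.: 3
REVERSE_PRIMER = "CGGGATCCTTATCGTCATCG"  # SEQ ID NO.: 4


def amplicon_length(template, primer_a, primer_b):
    """Length spanned by primer_b (same sense as template) and primer_a
    (anneals to the template), or None if either site is absent."""
    start = template.find(primer_b)
    site = reverse_complement(primer_a)
    end = template.find(site)
    if start < 0 or end < 0:
        return None
    return end + len(site) - start


# Reaction volumes in microliters, as listed above.
MIX_UL = {"Q5 mastermix": 25.0, "forward primer": 2.5,
          "reverse primer": 2.5, "extracted polynucleotide": 1.0}
TOTAL_UL = 50.0
water_ul = TOTAL_UL - sum(MIX_UL.values())
primer_final_um = 10.0 * MIX_UL["forward primer"] / TOTAL_UL

# Programmed hold time: initial denaturation, 12 cycles, final extension.
hold_seconds = 30 + 12 * (10 + 10 + 10) + 2 * 60

if __name__ == "__main__":
    print(amplicon_length(TEMPLATE, FORWARD_PRIMER, REVERSE_PRIMER))
    print(water_ul, primer_final_um, hold_seconds)
```

Under these inputs the two primers sit at the extreme ends of the 100-mer, so the expected product spans the full 100 bases.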

The PCR products were also run on a BioAnalyzer (data not shown), demonstrating sharp peaks at the 100-mer position. Next, the PCR amplified samples were cloned and Sanger sequenced. Table 2 summarizes the results from the Sanger sequencing for samples taken from spots 1-5 from chip 1 and for samples taken from spots 6-10 from chip 2.

TABLE 2 Sequencing Results

Spot    Error rate    Cycle efficiency
1       1/763 bp      99.87%
2       1/824 bp      99.88%
3       1/780 bp      99.87%
4       1/429 bp      99.77%
5       1/1525 bp     99.93%
6       1/1615 bp     99.94%
7       1/531 bp      99.81%
8       1/1769 bp     99.94%
9       1/854 bp      99.88%
10      1/1451 bp     99.93%
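The two columns of Table 2 appear to be related by a simple conversion, which can be sketched as follows. The relationship used here (cycle efficiency equals one minus the per-base error rate, expressed as a percentage) is an assumption inferred from the tabulated values rather than a definition given in the text; the names are illustrative.

```python
# Bases per error for spots 1-10, copied from the "Error rate" column.
BASES_PER_ERROR = {1: 763, 2: 824, 3: 780, 4: 429, 5: 1525,
                   6: 1615, 7: 531, 8: 1769, 9: 854, 10: 1451}


def cycle_efficiency_percent(bases_per_error):
    """Assumed relation: efficiency = 100 * (1 - 1 / bases_per_error)."""
    return 100.0 * (1.0 - 1.0 / bases_per_error)


EFFICIENCY = {spot: f"{cycle_efficiency_percent(n):.2f}%"
              for spot, n in BASES_PER_ERROR.items()}

if __name__ == "__main__":
    for spot, value in EFFICIENCY.items():
        print(spot, value)
```

With this assumed relation, the computed percentages round to the values listed in the "Cycle efficiency" column.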

Thus, the high quality and uniformity of the synthesized polynucleotides were repeated on two chips with different surface chemistries. Overall, 89%, corresponding to 233 out of 262, of the 100-mers that were sequenced were perfect sequences with no errors.

Finally, Table 3 summarizes key error characteristics for the sequences obtained from the polynucleotide samples from spots 1-10.

TABLE 3 Error Characteristics

Sample ID                             OSA_0046   OSA_0047   OSA_0048   OSA_0049   OSA_0050   OSA_0051   OSA_0052   OSA_0053   OSA_0054   OSA_0055
Spot no.                              1          2          3          4          5          6          7          8          9          10
Total Sequences                       32         32         32         32         32         32         32         32         32         32
Sequencing Quality                    25 of 28   27 of 27   26 of 30   21 of 23   25 of 26   29 of 30   27 of 31   29 of 31   28 of 29   25 of 28
Oligo Quality                         23 of 25   25 of 27   22 of 26   18 of 21   24 of 25   25 of 29   22 of 27   28 of 29   26 of 28   20 of 25
ROI Match Count                       2500       2698       2561       2122       2499       2666       2625       2899       2798       2348
ROI Mutation                          2          2          1          3          1          0          2          1          2          1
ROI Multi Base Deletion               0          0          0          0          0          0          0          0          0          0
ROI Small Insertion                   1          0          0          0          0          0          0          0          0          0
ROI Single Base Deletion              0          0          0          0          0          0          0          0          0          0
Large Deletion Count                  0          0          1          0          0          1          1          0          0          0
Mutation: G > A                       2          2          1          2          1          0          2          1          2          1
Mutation: T > C                       0          0          0          1          0          0          0          0          0          0
ROI Error Count                       3          2          2          3          1          1          3          1          2          1
ROI Error Rate (Err)                  ~1 in 834  ~1 in 1350 ~1 in 1282 ~1 in 708  ~1 in 2500 ~1 in 2667 ~1 in 876  ~1 in 2900 ~1 in 1400 ~1 in 2349
ROI Minus Primer Error Rate (MP Err)  ~1 in 763  ~1 in 824  ~1 in 780  ~1 in 429  ~1 in 1525 ~1 in 1615 ~1 in 531  ~1 in 1769 ~1 in 854  ~1 in 1451
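The ROI error rates in Table 3 can be reproduced from the match and error counts in the same table. The formula below (one error per (matches + errors) / errors bases, rounded to the nearest whole base) is an assumption inferred from the tabulated numbers, since the text does not state how the rate was computed; the names are illustrative.

```python
# Values for spots 1-10, copied from Table 3.
ROI_MATCH_COUNT = [2500, 2698, 2561, 2122, 2499, 2666, 2625, 2899, 2798, 2348]
ROI_ERROR_COUNT = [3, 2, 2, 3, 1, 1, 3, 1, 2, 1]


def bases_per_error(match_count, error_count):
    """Assumed relation: (matches + errors) / errors, rounded half up.

    Integer arithmetic is used so the rounding is exact.
    """
    total = match_count + error_count
    return (2 * total + error_count) // (2 * error_count)


ROI_ERROR_RATE = [bases_per_error(m, e)
                  for m, e in zip(ROI_MATCH_COUNT, ROI_ERROR_COUNT)]

if __name__ == "__main__":
    for spot, rate in enumerate(ROI_ERROR_RATE, start=1):
        print(f"spot {spot}: ~1 in {rate}")
```

The primer-excluded (MP) rates in the last row of the table cannot be recomputed this way, because the primer-excluded counts are not tabulated.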

Example 4. Overlap Extension PCR without Primer Removal

Two genes (Gene 1 and Gene 2) were selected to perform overlap extension PCR without primer removal. Varying lengths of base pair overlap were tested for Gene 1 and Gene 2, including 20, 41, and 100 base pairs. PCR amplification was performed using universal primers. Referring to FIG. 14, incorrect assembly (white) and correct assembly (black) were determined for Gene 1 and Gene 2 amplified with universal primers and having a base pair overlap of 20, 41, and 100 base pairs. Assembly of Gene 1 and Gene 2 without amplification by universal primers with a 20 base pair overlap was also determined. As seen in FIG. 14, amplification with universal primers and a base pair overlap of 41 base pairs resulted in improved assembly for Gene 1 and Gene 2. For Gene 1, assembly was about 91%.

Example 5. Single-Stranded DNA Mediated Hierarchal Assembly of Two Gene Fragments

A 3 kb gene was assembled from two gene fragments. The 5′ and 3′ ends of each fragment were appended with uracils, and the fragments were amplified using a uracil compatible polymerase such as KapaU polymerase or PhusionU polymerase. Each of the two fragments further comprised a homology sequence. See FIG. 1. The two fragments were then mixed and amplified with universal primers and Q5 DNA polymerase. Q5 DNA polymerase is incompatible with uracil and resulted in stalling at a uracil base. Single stranded fragments that do not comprise uracil were thus generated. The two fragments were then combined and amplified to generate the 3 kb gene.
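The effect of polymerase stalling at uracil can be pictured with a toy model. This is a deliberately simplified sketch and not a model of enzyme kinetics: it assumes that extension stops exactly at the first uracil encountered on the template and that nothing else limits extension. The function name and the example template are hypothetical.

```python
# Watson-Crick complements for template bases read during extension.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def extend_until_uracil(template_3_to_5):
    """Copy a template read 3'->5' and stop at the first uracil.

    The returned product is written 5'->3' and, by construction, contains
    no base copied from or beyond the uracil position (toy assumption).
    """
    product = []
    for base in template_3_to_5.upper():
        if base == "U":
            break  # uracil-intolerant polymerase stalls here
        product.append(COMPLEMENT[base])
    return "".join(product)


if __name__ == "__main__":
    # Hypothetical template: five copyable bases, then a uracil-containing tail.
    print(extend_until_uracil("TTGACUAGG"))
```

In this picture, the copied strand ends before the uracil-containing adapter, which is consistent with the statement that the resulting single stranded fragments do not comprise uracil.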

Example 6. Single-Stranded DNA Mediated Hierarchal Assembly of Three Gene Fragments

A 3 kb gene was assembled from three gene fragments. The 5′ and 3′ ends of each fragment were appended with flanking adapter sequences comprising uracils, and the fragments were amplified using a uracil compatible polymerase such as KapaU polymerase or PhusionU polymerase. Each of the three fragments comprised a homology sequence. See FIG. 2. The three fragments were mixed and amplified with Q5 DNA polymerase. Q5 DNA polymerase is incompatible with uracil and resulted in stalling at a uracil base. Single stranded fragments that do not comprise uracil were thus generated. Two of the three fragments were then combined and amplified to generate a single fragment. The third fragment was combined with the synthesized fragment and amplified to generate the 3 kb gene.

Example 7. Single-Stranded DNA Mediated Hierarchal Assembly with Varying Base Pair Overlap Sequence Length

Two genes (Gene 1 and Gene 2) were assembled similarly to Examples 5-6. The 5′ and 3′ ends of each fragment were appended with flanking adapter sequences comprising uracils, and the fragments were amplified using a uracil compatible polymerase such as KapaU polymerase or PhusionU polymerase. Each of the two fragments comprised a homology sequence. The length of the homology sequence was 20, 41, or 100 base pairs. The two fragments were mixed and amplified with universal primers and Q5 DNA polymerase or KapaHiFi polymerase. Q5 DNA polymerase and KapaHiFi polymerase are incompatible with uracil and resulted in stalling at a uracil base. Single stranded fragments that do not comprise uracil were thus generated. The two fragments were combined and amplified to generate Gene 1 and Gene 2.

Incorrect assembly (white) and correct assembly (black) were determined for Gene 1 and Gene 2 amplified with Q5 DNA polymerase and KapaHiFi polymerase and having a base pair overlap of 20, 41, and 100 base pairs (FIG. 15). The 20, 41, and 100 base pair overlaps each resulted in over 70% correct assembly with both Q5 DNA polymerase and KapaHiFi polymerase (FIG. 15). The 41 base pair overlap resulted in further improved assembly, with 100% correct assembly with both Q5 DNA polymerase and KapaHiFi polymerase (FIG. 15).

Example 8. In Vitro Recombination Cloning

In Vitro Recombination (IVTR) Lysate Preparation

JM109 cell lysate was prepared. JM109 was streaked on an M9 agar plate and incubated at 37° C. for about 24 hours. Colonies were picked and incubated in 15 mL tubes comprising 3 mL of M9 broth for a seed culture. The tubes were shaken at 250 rpm at 37° C. for about 19 hours. When OD600 was between about 2-3, 0.016 mL of the seed culture was removed and added to flasks comprising 100 mL of TB broth. The flasks were shaken at 300 rpm at 37° C. for about 4.5 hours. The cells were spun down at 4,816×g for 10 minutes and washed with 100 mL of ice-cold water. The cells were lysed with 2.4 mL CelLytic B Cell Lysis Reagent (Sigma-Aldrich) for 10 minutes at room temperature and centrifuged at 18,615×g for 5 minutes to collect the supernatant (about 2 mL). The supernatant was mixed with 2 mL of ice-cold 80% glycerol and snap frozen in a dry ice-ethanol bath followed by storage at −80° C.
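Two quantities implied by the preparation above, the dilution of the seed culture into TB broth and the final glycerol concentration of the stored lysate, can be worked out with a short sketch. The volumes are taken from the text; the function names are illustrative, and the glycerol figure assumes the volumes are simply additive and that the supernatant volume is exactly 2 mL (the text says "about 2 mL").

```python
def dilution_factor(sample_ml, diluent_ml):
    """Fold dilution when sample_ml is added to diluent_ml."""
    return (sample_ml + diluent_ml) / sample_ml


def final_percent(stock_percent, stock_ml, other_ml):
    """Final concentration after mixing a stock with another liquid,
    assuming additive volumes."""
    return stock_percent * stock_ml / (stock_ml + other_ml)


# 0.016 mL of seed culture into 100 mL of TB broth.
seed_dilution = dilution_factor(0.016, 100.0)

# 2 mL of 80% glycerol mixed with about 2 mL of supernatant.
glycerol_percent = final_percent(80.0, 2.0, 2.0)

if __name__ == "__main__":
    print(f"seed culture diluted about {seed_dilution:.0f}-fold")
    print(f"final glycerol about {glycerol_percent:.0f}%")
```

Under these assumptions the stored lysate contains roughly 40% glycerol.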

IVTR Cloning

A gene fragment comprised from 5′ to 3′: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence. The gene fragment and a vector comprising homologous sequences to the first homology sequence and the second homology sequence were incubated with JM109 cell lysate. See FIG. 4. A reaction was set up similar to Table 4. Varying amounts of insert were tested: 13 fmol (white bars), 26 fmol (hashed bars), and 40 fmol (black bars) (FIG. 16). The ratio of insert:vector (X-axis) was also varied including ratios of 1:1, 2:1, and 4:1.

Tens of thousands of colonies were retrieved. Colony forming units (CFU, Y-axis) were then determined as a measure of efficiency. More than 99% correct assembly was observed for any selected condition (1 misassembly out of 437 valid clones) (FIG. 16). Referring to FIG. 17, colony PCR results of 48 colonies showed only 4 samples failed, which were background plasmids.
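The reported counts can be converted to percentages directly. The following is a minimal sketch of that calculation using only the counts stated above (1 misassembly out of 437 valid clones, and 4 failed samples out of 48 colonies). The function name is illustrative.

```python
def percent_correct(failures, total):
    """Percentage of samples that did not fail, given a failure count and a total."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failures <= total:
        raise ValueError("failures must be between 0 and total")
    return 100.0 * (total - failures) / total


# Sequenced clones: 1 misassembly out of 437 valid clones.
assembly_rate = percent_correct(1, 437)

# Colony PCR: 4 failed samples out of 48 colonies.
colony_pcr_rate = percent_correct(4, 48)

print(f"Correct assembly: {assembly_rate:.2f}%")
print(f"Colony PCR pass rate: {colony_pcr_rate:.2f}%")
```

The first value is about 99.77%, consistent with the statement of more than 99% correct assembly.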

TABLE 4

Reaction Conditions      Final Concentration
10X IVTR buffer          1X
JM109 VS208 ASP          4 nM
Fragment 1               4 nM
Fragment 2               4 nM

Example 9. In Vitro Recombination Cloning with Two Fragments

A first gene fragment comprised from 5′ to 3′: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence. A second gene fragment comprised from 5′ to 3′: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence. The first homology sequence of the first gene fragment was homologous to a sequence on the vector. See FIG. 4. The first gene fragment, second gene fragment, and vector were incubated with JM109 cell lysate.

The effects of the length of the homology sequence (20, 41, or 100 base pairs) and location of the homology sequence on colony forming units (CFU, Y-axis) were determined (FIG. 18). Data are summarized in Table 5. The homology sequence was either flanked by the universal primers (internal) or at the 5′ or 3′ end of the gene fragment (terminal). CFUs increased with a homology sequence of 41 base pairs. A terminal location of the homology sequence of 41 base pairs resulted in a greater number of CFUs as compared to an internal location. A terminal location of the homology sequence of 100 base pairs also resulted in a greater number of CFUs as compared to an internal location.

Referring to FIG. 19, the percentage of misassembly (white bars) and assembly (black bars) was determined. A terminal location and homology sequence length of 20, 41, and 100 base pairs resulted in 100% assembly. An internal location and homology sequence length of 20 and 41 base pairs also resulted in 100% assembly.

TABLE 5 Colony Forming Unit Results

Homology Sequence Length
(number of base pairs)      Terminal or Internal      CFU
20                          Terminal                  190
20                          Internal                  200
41                          Terminal                  1280
41                          Internal                  720
100                         Terminal                  1030
100                         Internal                  190
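The CFU values in Table 5 can be compared by homology length and location. The following is a minimal sketch that computes the terminal-to-internal CFU ratio for each homology length and finds the condition with the most colonies. The data structure and function name are illustrative, and the values are copied from Table 5.

```python
# CFU values from Table 5, keyed by (homology length in base pairs, location).
cfu = {
    (20, "terminal"): 190,
    (20, "internal"): 200,
    (41, "terminal"): 1280,
    (41, "internal"): 720,
    (100, "terminal"): 1030,
    (100, "internal"): 190,
}


def terminal_to_internal(cfu_table):
    """Return {homology length: terminal CFU / internal CFU} for each length."""
    lengths = sorted({length for length, _ in cfu_table})
    return {
        length: cfu_table[(length, "terminal")] / cfu_table[(length, "internal")]
        for length in lengths
    }


ratios = terminal_to_internal(cfu)
best_condition = max(cfu, key=cfu.get)

for length, ratio in ratios.items():
    print(f"{length} bp: terminal/internal CFU ratio = {ratio:.2f}")
print(f"Highest CFU: {best_condition[0]} bp, {best_condition[1]}")
```

A ratio above 1 indicates that the terminal location gave more colonies than the internal location for that homology length, which is the case for 41 and 100 base pairs but not for 20 base pairs.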

Example 10. In Vitro Recombination Cloning with Varying Lengths of Non-Homologous Sequences

The effect of varying lengths of the non-homologous sequence was determined. Four fragments were designed: V1, V2, V3, and V4 (FIG. 5). V1 comprised an overlapping sequence at the terminal end. V2 comprised an internal overlapping sequence followed by a 24 base pair sequence at the 3′ end. V3 comprised an overlapping sequence followed by a 124 base pair sequence at the 3′ end. V4 comprised an overlapping sequence followed by a 324 base pair sequence at the 3′ end.

Percentage of assembly (FIG. 20A) and total CFU (FIG. 20B) were determined for sequences comprising 0 (V1), 24 (V2), 124 (V3), and 324 (V4) base pairs at the 3′ end. Referring to FIG. 20A, 100% correct assembly was observed for sequences comprising 0, 24, 124, and 324 base pairs at the 3′ end. Referring to FIG. 20B, sequences comprising 0 and 24 base pairs at the 3′ end resulted in increased CFUs.

Example 11. In Vitro Recombination Cloning with Varying Internal Homology

The effect of internal homology was determined. Three fragments were designed: V5, V6, and V7 (FIG. 6A). V5 comprised a 24 base pair sequence between two overlapping sequences. V6 comprised a 124 base pair sequence between two overlapping sequences. V7 comprised a 324 base pair sequence between two overlapping sequences.

Percentage of assembly (FIG. 21A) and total CFU (FIG. 21B) were determined for internal sequences comprising 24 (V5), 124 (V6), and 324 (V7) base pairs. Referring to FIG. 21A, more than 80% correct assembly was observed for internal sequences comprising 124 and 324 base pairs. Referring to FIG. 21B, internal sequences comprising 124 and 324 base pairs also resulted in increased CFUs.

Example 12. Single-Stranded DNA Mediated Hierarchal Assembly of Multiple Gene Fragments

A 3 kb gene is assembled from five gene fragments. The 5′ and 3′ ends of each fragment are appended with flanking adapter sequences comprising uracils. Each of the five fragments comprises a homology sequence. The five fragments are mixed and amplified with universal primers and Q5 DNA polymerase. Following amplification using DNA polymerase, single stranded fragments that do not comprise uracil are generated. Two of the five fragments are combined and amplified with DNA polymerase to generate a single fragment. A third fragment is combined with the synthesized fragment and amplified to generate a single fragment. A fourth fragment is combined with the synthesized fragment and amplified to generate a single fragment. A fifth fragment is combined with the synthesized fragment and amplified to generate the assembled 3 kb gene.
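The order of operations in this example, in which two fragments are joined and each remaining fragment is then added to the growing product, can be illustrated with a toy string model. The following is a minimal sketch and does not model the polymerase chemistry. It assumes that adjacent fragments share an exact terminal overlap of at least 20 bases, and it uses a randomly generated 300-base sequence rather than a 3 kb gene. All names are hypothetical.

```python
import random
from functools import reduce


def join_by_overlap(left, right, min_overlap=20):
    """Join two sequences where the end of `left` matches the start of `right`.

    The longest exact overlap of at least `min_overlap` bases is used.
    """
    max_len = min(len(left), len(right))
    for k in range(max_len, min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    raise ValueError(f"no overlap of at least {min_overlap} bases found")


def hierarchical_assembly(fragments, min_overlap=20):
    """Join fragments one at a time, each onto the product of the previous step."""
    return reduce(lambda acc, frag: join_by_overlap(acc, frag, min_overlap), fragments)


# Toy target: a 300-base sequence with a fixed seed (illustrative, not from the source).
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(300))

# Split into five fragments; each of the first four extends 20 bases into the next.
piece, overlap = 60, 20
fragments = [target[i * piece : (i + 1) * piece + overlap] for i in range(4)]
fragments.append(target[4 * piece :])

assembled = hierarchical_assembly(fragments, min_overlap=overlap)
print(len(fragments), "fragments assembled into", len(assembled), "bases")
print("Matches target:", assembled == target)
```

Each call to `join_by_overlap` corresponds to one round of combining the current product with the next fragment. In this model, a fragment without a sufficient overlap raises an error rather than yielding a misassembled product.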

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

What we claim is:
 1. A method for nucleic acid assembly, comprising: (a) providing at least one double stranded nucleic acid comprising in 5′ to 3′ order: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence, wherein the first homology sequence and the second homology sequence each comprises about 20 to about 100 base pairs in length; (b) providing a vector comprising the first homology sequence and the second homology sequence; and (c) mixing the at least one double stranded nucleic acid and the vector with a bacterial lysate.
 2. The method of claim 1, wherein the bacterial lysate comprises a nuclease or a recombinase.
 3. The method of claim 1, wherein the bacterial lysate comprises a nuclease and a recombinase.
 4. The method of claim 1, wherein the first homology sequence and the second homology sequence each comprises about 20 base pairs.
 5. The method of claim 1, wherein the first homology sequence and the second homology sequence each comprises about 41 base pairs.
 6. The method of claim 1, wherein the first homology sequence and the second homology sequence each comprises about 100 base pairs.
 7. The method of claim 1, wherein the first homology sequence or the second homology sequence is flanked by the 5′ flanking adapter sequence and the 3′ flanking adapter sequence.
 8. The method of claim 1, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.
 9. A method for nucleic acid synthesis and assembly, comprising: (a) de novo synthesizing a plurality of polynucleotides, wherein each polynucleotide comprises a first homology region that comprises in 5′ to 3′ order: a 5′ flanking adapter sequence, a first homology sequence, an insert sequence, a second homology sequence, and a 3′ flanking adapter sequence, wherein the first homology sequence and the second homology sequence each comprises about 20 to about 100 base pairs in length, and wherein each polynucleotide comprises a homology sequence identical to that of another polynucleotide of the plurality of polynucleotides; and (b) mixing of the plurality of polynucleotides with a bacterial lysate to processively form nucleic acids each having a predetermined sequence.
 10. The method of claim 9, wherein the bacterial lysate comprises a nuclease or a recombinase.
 11. The method of claim 9, wherein the bacterial lysate comprises a nuclease and a recombinase.
 12. The method of claim 9, wherein the first homology sequence and the second homology sequence each comprises about 20 base pairs.
 13. The method of claim 9, wherein the first homology sequence and the second homology sequence each comprises about 41 base pairs.
 14. The method of claim 9, wherein the first homology sequence and the second homology sequence each comprises about 100 base pairs.
 15. The method of claim 9, wherein the first homology sequence or the second homology sequence is flanked by the 5′ flanking adapter sequence and the 3′ flanking adapter sequence.
 16. The method of claim 9, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.
 17. A method for nucleic acid assembly, comprising: (a) providing a plurality of double stranded nucleic acids; (b) annealing a uracil at a 5′ end and a 3′ end of at least two of the double stranded nucleic acids; (c) amplifying the double stranded nucleic acids using a uracil compatible polymerase to form amplification products; (d) mixing the amplification products from step (c) to form a mixture; and (e) amplifying the mixture from step (d) using a uracil incompatible polymerase to generate a single-stranded nucleic acid.
 18. The method of claim 17, wherein the uracil incompatible polymerase is a DNA polymerase.
 19. The method of claim 17, wherein the homology sequence comprises about 20 to about 100 base pairs.
 20. The method of claim 17, wherein a percentage of correct assembly is at least 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99%.